- Hed
- Mar 31, 2004
Fun Shoe
Cross-posting since this is more of a data question than a Python one (thanks WHERE MY HAT IS AT)
Is there a standard tool to look at parquet files?
I'm trying to go through a slog of parquet files in polars and keep getting an exception:
Python code:
Traceback (most recent call last):
  File "log_count.py", line 57, in <module>
    daily_output = result.collect()
                   ^^^^^^^^^^^^^^^^
  File "venv\Lib\site-packages\polars\lazyframe\frame.py", line 1937, in collect
    return wrap_df(ldf.collect())
                   ^^^^^^^^^^^^^
polars.exceptions.ComputeError: not implemented: reading parquet type Double to Int64 still not implemented
I know what this means, but I don't have a good way to diagnose which files the errors are in, so I end up moving groups of files around until it works, then putting them back one by one until I find the offender.
I understand I'm trying to have the efficiency of polars in lazy mode, but I'd love to know where it specifically blows up to help figure out the problem upstream.
Is there a better place to ask polars / data science questions?
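One way to find the offenders without shuffling files around is to compare each file's schema before running the lazy query. A minimal sketch, assuming the failure comes from one column being stored as Double in some files and Int64 in others. The helper names `read_schemas` and `find_dtype_conflicts` are made up for illustration; `pl.read_parquet_schema` is the polars call that returns a file's schema without reading its data.

```python
from collections import defaultdict
from pathlib import Path


def read_schemas(folder, pattern="*.parquet"):
    """Map each parquet file path to {column name: dtype as a string}."""
    # Imported here so the comparison helper below has no third-party dependency.
    import polars as pl

    return {
        str(path): {
            name: str(dtype)
            for name, dtype in pl.read_parquet_schema(path).items()
        }
        for path in sorted(Path(folder).glob(pattern))
    }


def find_dtype_conflicts(schemas):
    """Return {column: {dtype: [files]}} for columns whose dtype differs across files."""
    by_column = defaultdict(lambda: defaultdict(list))
    for path, schema in schemas.items():
        for column, dtype in schema.items():
            by_column[column][dtype].append(path)
    # Keep only the columns that show up with more than one dtype.
    return {
        column: dict(dtypes)
        for column, dtypes in by_column.items()
        if len(dtypes) > 1
    }
```

Calling `find_dtype_conflicts(read_schemas("logs"))`, where `logs` is a placeholder folder name, should list each disagreeing column together with the files that carry each dtype, so the odd ones out can be fixed upstream or cast explicitly before the scan.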
Feb 21, 2024 21:26