Data For ironmussa/Optimus

Only showing the last 50 predictions


# Title Body Link Prediction Confidence Labeled?
950 [FEA] Implement Optimus parser For convenience, we want to implement a parser that let the user create and string expression... feature_request 0.99 True None None
949 [] **Is your feature request related to a problem? Please describe.** A clear and concise... feature_request 0.72 True None None
948 [FEA] Implement transformation funtions that not modify the dataframe Optimus was designed to transform the data inside the dataframe, but nothing was done to process... feature_request 0.97 True None None
781 Fix simple typo: ouput -> output # Issue Type [x] Bug (Typo) # Steps to Replicate 1. Examine optimus/ml/encoding.py. 2. Search... bug 0.99 True None None
760 Outliers module not working I have tried to chain or import outliers with many different methods. I have tried importing... bug 0.89 True None None
741 Explore new imputation methods https://towardsdatascience.com/6-different-ways-to-compensate-for-missing-values-data-imputatio... feature_request 0.89 True None None
728 Add Dask as Backend WIP here https://github.com/ironmussa/Optimus/tree/feature/new-api-refactoring - [ ] Create... feature_request 0.98 True None None
716 levenshtein_cluster change Name please change ml/distancecluster.py line number line number 114 in method levenstine_cluster... feature_request 0.54 True None None
709 Refactor IO package @argenisleon This is still WIP, but I would like to get some initials thoughts and see if you... feature_request 0.86 True None None
684 Create examples for EMR, dataproc and databricks Because some user has had problems configuring these services could be helpful to make some... feature_request 0.96 True None None
680 pickle error : ModuleNotFoundError: No module named 'fastnumbers' Optimus 2.2.13 fuctionality throws error : > op.profiler.to_json(df, '*') even though... bug 0.76 True None None
669 Enhance the profiler to detect currencies Detecting currencies could be helpful https://www.toptal.com/designers/htmlarrows/currency/ feature_request 0.99 True None None
666 Add support for connecting to DB2 feature_request 0.94 True None None
664 getting ava.lang.IllegalArgumentException: Unsupported class file major version 56 on reading back written parquet getting ava.lang.IllegalArgumentException: Unsupported class file major version 56 after reading... bug 0.5 False None None
663 Add geospatial features to optimus can we have geopandas like features in optimus? feature_request 0.92 True None None
662 Separate Enrichment features Enrichment features seem to be helpful but not very popular. Also Its couple with pymongo that... feature_request 0.78 True None None
645 Add ability to create Optimus DF from Spark DF As far as I can tell, functionality to create an Optimus DataFrame from an existing Spark... feature_request 0.99 True None None
644 Adding function to handle cols and dataframes metadata feature_request 0.88 True None None
642 urllib.error.URLError: <urlopen error unknown url type: ['hdfs'] when i provide the HDFS url while reading a parquet file in HDFS in remote system. I am getting... bug 0.75 True None None
641 Profiler fails when datetime or numeric types are detected When running the profiler with `infer=True` with actual columns having datetime and integer... bug 0.92 True None None
639 Python 3 No module named '_ssl' when pip install optimuspyspark into python package I install python3.7.3 via downloading python source code and compiling, configure, and finally... question 0.6 False None None
638 Add support for connecting Cassandra https://github.com/datastax/java-driver feature_request 0.95 True None None
633 Using pypika to improve JDBC queries feature_request 0.88 True None None
632 Add extra type Profiler can detect at the moment: * String * Integer * Decimal * Boolean * Array *... feature_request 0.99 True None None
627 Add support for connecting Presto query engine feature_request 0.97 True None None
622 Implement stratified sampling feature_request 0.97 True None None
616 Can not connect to postgres - No suitable driver found I am running into the following problem while connecting postgres using... bug 0.68 True None None
610 Create a docker container to test hdfs file loading and saving feature_request 0.94 True None None
602 Profiler can not handle columsn wit When running the profiler with ``` df = op.create.df( [ ("names", "str",... bug 0.94 True None None
600 Create docker compose to run db containers feature_request 0.92 True None None
599 Inline css in table() Inline the css in the table so it can look correct when not rendered ouside a notebook feature_request 0.58 True None None
593 Create tests for the example notebooks feature_request 0.91 True None None
590 Give additional guidance to the user about how to configure env vars Setting env vars, for example, SPARK_PYTHON, PYSPARK_DRIVER_PYTHON, SPARK_HOME generate a lot of... feature_request 0.92 True None None
585 Explore API standarization Profiler, Keycollision, Outliers. JDBC does not have a standard way to be used. feature_request 0.76 True None None
584 Key collision functions do not handle multiple cols correctly bug 0.82 True None None
583 Is there a way to disable caching of dataframe. I have a dataframe. when i do > kc.fingerprint_cluster(df, columnname) then i change some... question 0.63 True None None
581 Profiler returning unique_count value wrong Reported by @pallav1991 : I have one more issue that there at times when i have seen the... bug 0.95 True None None
571 SyntaxError: unexpected EOF while parsing Hi Team - When I try to execute following command on my dataset... bug 0.8 True None None
566 [BUG].table() does not work in databricks bug 0.87 True None None
565 OverflowError: signed integer is greater than maximum Hello @argenisleon / @FavioVazquez - I started using latest release version of library. But... bug 0.77 True None None
563 [Enhancement] Can we provide schema as an option for JDBC Currently in JDBC connection schema is set to default "public". Can we provide schema name as an... feature_request 0.95 True None None
562 [BUG] JDBC Redshift connection url. Port not configured. While creating the JDBC redshift connection url, port is not getting added in the url. Port is... bug 0.97 True None None
560 Slow speed Hello @argenisleon / @FavioVazquez - I am running couple of operations like mentioned below... question 0.69 True None None
558 installation feature_request 0.68 True None None
557 JDBC Connectivity Hi Team - I am trying to dump my dataset into a PostgreSQL and read from it within a Jupyter... question 0.78 True None None
554 Exception Handling Hi Team - I am facing an issue with running profiler on a datetime column in my dataset.... question 0.47 False None None
552 Read from tab separated files Hi Team - I started using the library and began with loading my datasets which are in... feature_request 0.58 True None None
551 Issue with installing optimuspyspark Hi Team - I am in the middle of installing optimuspyspark package to experiment with data... question 0.54 False None None
547 I am getting below exception while reading csv file and writing it to parquet file Today After Upgrade when I ran my Program I am getting below exception. An error occurred while... bug 0.59 True None None
544 Unable to load csv from hdfs I am uploading the file from hdfs with below code snipit please check. This is throwing and... bug 0.8 True None None