Releases: hosseinmoein/DataFrame
Oct-2025
Added collapsing menus to docs and improved content and visuals
Fixed an edge case (\r in strings) when reading csv2 files
Implemented sort_freq() function
Implemented writing data in pretty-printed format
Implemented writing data in markdown format
Implemented writing data in latex table format
Implemented writing data in html table format
Implemented load_random_sample() function
Implemented resample() function
Implemented DivideToBinsVisitor visitor
Implemented DivideToQuantilesVisitor visitor
Implemented the pipe() function
Implemented fl_valid_index() function
Implemented permutation_vec() function
Implemented get_data_every_n() function
Implemented get_n_largest_data() function
Implemented get_n_smallest_data() function
Implemented is_nan_mask() function
Implemented is_infinity_mask() function
Implemented is_default_mask() function
Implemented SkewVisitor visitor
Implemented KurtosisVisitor visitor
Added get_mean() to CovVisitor and a bunch of other visitors
Implemented CategoryVisitor visitor
Implemented CoeffVariationVisitor visitor
Added inclusiveness enum class and added it as a parameter to a bunch of slicing and other functions with ranges
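To illustrate what an inclusiveness parameter on a range-based slicing function can look like, here is a minimal standalone sketch. It does not use the library: the enum `Inclusiveness`, its enumerator names, and the helpers `in_range()` and `slice_by_value()` are hypothetical names chosen for this example, and the library's actual enum and signatures may differ.

```cpp
#include <cassert>
#include <vector>

// Hypothetical enum for illustration only; the library's real enumerators
// may be named or ordered differently.
enum class Inclusiveness { begin, end, both, neither };

// Returns true if x lies between lo and hi, where each endpoint is included
// or excluded according to inc.
template<typename T>
bool in_range(const T &x, const T &lo, const T &hi, Inclusiveness inc) {
    assert(!(hi < lo));  // the range must not be reversed
    const bool with_lo =
        inc == Inclusiveness::begin || inc == Inclusiveness::both;
    const bool with_hi =
        inc == Inclusiveness::end || inc == Inclusiveness::both;
    const bool above = with_lo ? !(x < lo) : (lo < x);
    const bool below = with_hi ? !(hi < x) : (x < hi);
    return above && below;
}

// Collects the values of data that fall inside the requested range.
template<typename T>
std::vector<T> slice_by_value(const std::vector<T> &data,
                              const T &lo, const T &hi,
                              Inclusiveness inc) {
    std::vector<T> out;
    for (const T &x : data)
        if (in_range(x, lo, hi, inc))
            out.push_back(x);
    return out;
}
```

With the data `{1, 2, 3, 4, 5}` and the range 2 to 4, `both` keeps 2, 3, 4, `begin` keeps 2, 3, `end` keeps 3, 4, and `neither` keeps only 3.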
August-2025
Improved documentation by creating split windows
Added cheat-sheet to docs
Fixed an edge-case bug in parallel sort
Implemented detect_and_fill()
Implemented detect_and_change()
Implemented KolmoSmirnovTestVisitor visitor
Implemented MannWhitneyUTestVisitor visitor
Implemented mask()
Added many more functionalities to the internal Matrix class
Implemented fast_ica()
Fixed the inconsistency in writing/reading DateTime columns to/from files
Added ability to read only selected columns from files
Implemented MutualInfoVisitor visitor
Added ability to specify a delimiter when writing/reading to/from csv files.
Implemented AndersonDarlingTestVisitor visitor
Implemented ShapiroWilkTestVisitor visitor
Implemented CramerVonMisesTestVisitor visitor
Implemented unpivot()
Implemented pivot()
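For readers unfamiliar with the unpivot() and pivot() entries above, the sketch below shows the general wide-to-long reshaping idea in plain C++. It is not the library's implementation or API: `LongRow` and `unpivot_sketch()` are hypothetical names, and the column-major input layout is an assumption made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One row of the long (unpivoted) table. Hypothetical type for illustration.
struct LongRow {
    std::string id;        // row identifier carried over from the wide table
    std::string variable;  // name of the wide column the value came from
    double      value;
};

// Turns a wide table (one id per row, one vector per value column) into a
// long table with one (id, variable, value) row per cell.
std::vector<LongRow>
unpivot_sketch(const std::vector<std::string> &ids,
               const std::vector<std::string> &col_names,
               const std::vector<std::vector<double>> &cols) {
    assert(col_names.size() == cols.size());

    std::vector<LongRow> out;
    out.reserve(ids.size() * cols.size());
    for (std::size_t c = 0; c < cols.size(); ++c) {
        assert(cols[c].size() == ids.size());  // every column spans all rows
        for (std::size_t r = 0; r < ids.size(); ++r)
            out.push_back({ ids[r], col_names[c], cols[c][r] });
    }
    return out;
}
```

A pivot is the inverse operation: it groups long rows by `id` and spreads each distinct `variable` back into its own column.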
April-2025
Enhanced documentation
Implemented SpectralClusteringVisitor visitor
Enhanced ThreadPool parallel_loop()
Implemented get_[data|view]_by_spectral()
Implemented determinant() and other functionality in the Matrix class
Implemented canon_corr()
Implemented MC_station_dist()
Implemented SeasonalPeriodVisitor visitor
Improved performance in reading files of different types
Changed read() signature to take a struct for its parameters -- backward incompatible change
Changed write() signature to take a struct for its parameters -- backward incompatible change
Implemented ability to read() csv2 files with user provided schema
Implemented knn()
Implemented DynamicTimeWarpVisitor visitor
Implemented AnomalyDetectByFFTVisitor visitor
Changed the interface of HampelFilterVisitor -- backward incompatible change
Implemented remove_data_by_fft()
Implemented AnomalyDetectByIQRVisitor visitor
Implemented AnomalyDetectByZScoreVisitor visitor
Implemented remove_data_by_iqr()
Implemented remove_data_by_zscore()
Implemented AnomalyDetectByLOFVisitor visitor
Ported to GCC14 compiler and fixed many edge-case bugs
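Several entries in this release concern z-score based anomaly detection and removal. As a rough illustration of the underlying idea only, the standalone sketch below flags values whose distance from the mean exceeds a threshold measured in sample standard deviations. The function names `zscore_outliers()` and `remove_indices()` are hypothetical and are not the library's visitor or member-function interface.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the indices (in ascending order) of values whose absolute z-score
// is greater than threshold. Uses the sample standard deviation (n - 1).
std::vector<std::size_t>
zscore_outliers(const std::vector<double> &data, double threshold) {
    assert(data.size() >= 2);

    const double n = static_cast<double>(data.size());
    double mean = 0.0;
    for (double x : data) mean += x;
    mean /= n;

    double sum_sq = 0.0;
    for (double x : data) sum_sq += (x - mean) * (x - mean);
    const double stdev = std::sqrt(sum_sq / (n - 1.0));

    std::vector<std::size_t> idx;
    if (stdev == 0.0) return idx;  // all values equal: nothing to flag
    for (std::size_t i = 0; i < data.size(); ++i)
        if (std::fabs(data[i] - mean) / stdev > threshold)
            idx.push_back(i);
    return idx;
}

// Returns a copy of data without the elements at the given ascending indices.
std::vector<double>
remove_indices(const std::vector<double> &data,
               const std::vector<std::size_t> &sorted_idx) {
    std::vector<double> out;
    std::size_t next = 0;  // position in sorted_idx of the next index to skip
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (next < sorted_idx.size() && sorted_idx[next] == i)
            ++next;
        else
            out.push_back(data[i]);
    }
    return out;
}
```

Because the mean and standard deviation are themselves affected by extreme values, approaches based on the IQR or a Hampel filter are often preferred for heavily contaminated data.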
Jan-2025
Improved documentation and code quality
Fixed a bug in assign()
Implemented get_[data|view]_by_kmeans()
Changed interface and optimized code in AffinityPropVisitor (backward incompatible change)
Implemented get_[data|view]_by_affin()
Added option to HampelFilterVisitor to populate indices to datapoints affected
Implemented remove_data_by_hampel()
Implemented MeanShiftVisitor visitor
Implemented get_[data|view]_by_dbscan()
Implemented get_[data|view]_by_mshift()
Improved performance in remove_duplicates()
Added FixedSizeString as one of the types that can be read/written from/to files
Added a stable_algo option to covariance and ... visitors to use a numerically stable algo instead of regular algo
Implemented a Matrix class to be used for internal calculations and analysis results
Implemented CrossCorrVisitor visitor
Optimized the implementation of AutoCorrVisitor
Implemented PartialAutoCorrVisitor visitor
Added max_lag parameter to AutoCorrVisitor
Implemented make_stationary()
Implemented StationaryCheckVisitor visitor
Implemented covariance_matrix()
Implemented pca_by_eigen()
Implemented compact_svd()
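The stable_algo entry above refers to a numerically stable alternative to the regular covariance computation. The notes do not say which algorithm the library uses, so the following is only a sketch of one well-known stable approach, a Welford-style single-pass update. The class name `OnlineCovariance` and its members are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical accumulator for illustration; not part of the library.
// It keeps running means and a running co-moment, which avoids subtracting
// two large, nearly equal sums as the textbook formula does.
class OnlineCovariance {
public:
    // Feeds one (x, y) observation into the running statistics.
    void add(double x, double y) {
        ++n_;
        const double dx = x - mean_x_;  // deviation from the old mean of x
        mean_x_ += dx / static_cast<double>(n_);
        mean_y_ += (y - mean_y_) / static_cast<double>(n_);
        co_moment_ += dx * (y - mean_y_);  // uses the updated mean of y
    }

    double mean_x() const { return mean_x_; }
    double mean_y() const { return mean_y_; }

    // Sample covariance (divides by n - 1); needs at least two observations.
    double covariance() const {
        assert(n_ >= 2);
        return co_moment_ / static_cast<double>(n_ - 1);
    }

private:
    std::size_t n_ = 0;
    double      mean_x_ = 0.0;
    double      mean_y_ = 0.0;
    double      co_moment_ = 0.0;
};
```

The benefit shows up when the data has a large offset relative to its spread: shifting every x by a large constant leaves the result essentially unchanged, whereas a sum-of-products formula can lose most of its significant digits.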
Oct-2024
Improved documentation both visually and content-wise
Changed NLargestVisitor to take N as constructor parameter instead of template parameter (backward incompatible change)
Implemented get_top_n_[data|view]()
Implemented get_bottom_n_[data|view]()
Implemented get_above_quantile_[data|view]()
Implemented get_below_quantile_[data|view]()
Added period parameter to ReturnVisitor visitor
Implemented starts_with()
Implemented ends_with()
Implemented CumCountVisitor visitor
Implemented in_between()
Implemented peaks()
Implemented valleys()
Made reading/writing large files faster
Implemented apply()
Made replace() faster with better algorithm
Implemented truncate()
Implemented a version of load_column() with functor generating data
Implemented explode()
Implemented reading/writing std::pair columns from/to files
Added more sanity checks
Implemented difference()
Implemented get_[data|view]_at_times()
Implemented get_[data|view]_before_times()
Implemented get_[data|view]_after_times()
Implemented get_[data|view]_on_days()
Implemented get_[data|view]_in_months()
Implemented get_[data|view]_on_days_in_month()
Implemented get_[data|view]_between_times()
Implemented remove_top_n_data()
Implemented remove_bottom_n_data()
Implemented remove_above_quantile_data()
Implemented remove_below_quantile_data()
Implemented remove_data_by_stdev()
Implemented get_[data|view]_by_stdev()
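As an illustration of the top-n and bottom-n selections listed in this release, the standalone sketch below returns the row indices of the n largest or n smallest values in a column. It is not the library's code: `top_n_indices()` and `bottom_n_indices()` are hypothetical helpers, and the library's functions operate on DataFrame columns rather than plain vectors.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns the indices of the n largest values, ordered largest first.
std::vector<std::size_t>
top_n_indices(const std::vector<double> &v, std::size_t n) {
    assert(n <= v.size());

    std::vector<std::size_t> idx(v.size());
    std::iota(idx.begin(), idx.end(), std::size_t(0));
    // Only the first n positions need to be sorted.
    std::partial_sort(idx.begin(),
                      idx.begin() + static_cast<std::ptrdiff_t>(n),
                      idx.end(),
                      [&v](std::size_t a, std::size_t b) {
                          return v[a] > v[b];
                      });
    idx.resize(n);
    return idx;
}

// Returns the indices of the n smallest values, ordered smallest first.
std::vector<std::size_t>
bottom_n_indices(const std::vector<double> &v, std::size_t n) {
    assert(n <= v.size());

    std::vector<std::size_t> idx(v.size());
    std::iota(idx.begin(), idx.end(), std::size_t(0));
    std::partial_sort(idx.begin(),
                      idx.begin() + static_cast<std::ptrdiff_t>(n),
                      idx.end(),
                      [&v](std::size_t a, std::size_t b) {
                          return v[a] < v[b];
                      });
    idx.resize(n);
    return idx;
}
```

Working with indices rather than values means the same selection can then be applied to every other column, which is what a whole-row "get" or "remove" operation needs. The order of tied values is unspecified in this sketch.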
July-2024
Enhanced documentation and code cleanups
Converted DateTime doc to html
Implemented PeaksAndValleysVisitor visitor
Implemented EhlersHighPassFilterVisitor visitor
Implemented EhlersBandPassFilterVisitor visitor
Implemented reading/writing in binary format
Implemented reading binary data format in chunks
Implemented serialize() and deserialize()
Implemented reading/writing containers in binary format
Added optional time-zone to strings parsed by DateTime constructor
Implemented PowerFitVisitor visitor
Implemented QuadraticFitVisitor visitor
Implemented fill_policy::lagrange_interpolate
Implemented correlation_type::kendall_tau
Implemented change_freq()
Implemented duplication_mask()
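The correlation_type::kendall_tau entry above adds a rank-based correlation measure. As a reminder of what that statistic computes, here is a minimal O(n^2) sketch of the tau-a variant, which ignores tie corrections. The function name `kendall_tau_a()` is hypothetical, and the library may use a different variant or a faster algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Kendall's tau-a: (concordant pairs - discordant pairs) / total pairs.
// Pairs tied in x or in y count as neither concordant nor discordant.
double kendall_tau_a(const std::vector<double> &x,
                     const std::vector<double> &y) {
    assert(x.size() == y.size());
    assert(x.size() >= 2);

    long long concordant = 0;
    long long discordant = 0;
    for (std::size_t i = 0; i + 1 < x.size(); ++i) {
        for (std::size_t j = i + 1; j < x.size(); ++j) {
            // The sign of the product tells whether x and y move together.
            const double s = (x[j] - x[i]) * (y[j] - y[i]);
            if (s > 0.0) ++concordant;
            else if (s < 0.0) ++discordant;
        }
    }

    const double n = static_cast<double>(x.size());
    const double total_pairs = n * (n - 1.0) / 2.0;
    return static_cast<double>(concordant - discordant) / total_pairs;
}
```

Identical orderings give 1, exactly reversed orderings give -1, and for x = {1, 2, 3}, y = {1, 3, 2} there are two concordant pairs and one discordant pair, so the result is 1/3.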
May-2024
Significantly enhanced documentation both content-wise and visually
Fixed a few edge-case bugs, including an edge-case in reading CSV2 format files
Factored out and cleaned code
Implemented inversion_count()
Implemented get_[data|view]_by_like()
Implemented remove_data_by_like()
Added char and uchar type to types read/written from/to files
Added ability to read/write columns of containers from/to files
remove_column() now requires a template parameter. It actually frees up the memory space now
Implemented clear()
Implemented swap()
Now using some of the std::ranges algorithms
Added scalar arithmetic DF operators
Added magnitude calculations to DotProdVisitor visitor
Added Euclidean distance calculations to DotProdVisitor visitor
Added Manhattan distance calculations to DotProdVisitor visitor
Implemented VectorSimilarityVisitor visitor
Replaced asserts in algos with exceptions and added a compile-time option for it (HMDF_SANITY_EXCEPTIONS)
Partially reengineered views so now you can use most of the API from views
Added sentinels to vector views iterators
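The inversion_count() entry above names a classic measure of how far a sequence is from sorted order. The sketch below shows the usual merge-sort based way to count inversions in O(n log n) time on a plain vector of integers. It illustrates the general technique only: `count_inversions()` and the `detail` helpers are hypothetical, and the library's function may have different semantics, for example counting relative to a second column.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace detail {

// Merges the sorted halves [lo, mid) and [mid, hi) and returns the number of
// pairs (i, j) with i in the left half, j in the right half and v[i] > v[j].
inline std::size_t merge_count(std::vector<int> &v, std::vector<int> &tmp,
                               std::size_t lo, std::size_t mid,
                               std::size_t hi) {
    assert(lo <= mid && mid <= hi);
    std::size_t i = lo, j = mid, k = lo, inversions = 0;
    while (i < mid && j < hi) {
        if (v[i] <= v[j]) {
            tmp[k++] = v[i++];
        } else {
            // v[j] is smaller than every remaining element of the left half.
            inversions += mid - i;
            tmp[k++] = v[j++];
        }
    }
    while (i < mid) tmp[k++] = v[i++];
    while (j < hi) tmp[k++] = v[j++];
    for (k = lo; k < hi; ++k) v[k] = tmp[k];
    return inversions;
}

// Sorts [lo, hi) in place and returns the number of inversions it contained.
inline std::size_t sort_count(std::vector<int> &v, std::vector<int> &tmp,
                              std::size_t lo, std::size_t hi) {
    if (hi - lo < 2) return 0;
    const std::size_t mid = lo + (hi - lo) / 2;
    // Both halves must be sorted before the merge, so sequence the calls.
    std::size_t inversions = sort_count(v, tmp, lo, mid);
    inversions += sort_count(v, tmp, mid, hi);
    inversions += merge_count(v, tmp, lo, mid, hi);
    return inversions;
}

}  // namespace detail

// Counts pairs (i, j) with i < j and v[i] > v[j]. Takes v by value because
// the algorithm sorts its working copy.
std::size_t count_inversions(std::vector<int> v) {
    std::vector<int> tmp(v.size());
    return detail::sort_count(v, tmp, 0, v.size());
}
```

For example, `{2, 4, 1, 3, 5}` contains the three inversions (2, 1), (4, 1) and (4, 3).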
Feb-2024
Multithreading was completely redesigned around a versatile thread-pool. Almost every API has a multithreaded version that kicks in for large datasets. This justifies increasing the major version number.
Added a thread-pool.
All Async calls now use the thread-pool.
Sort now uses parallel sort for large datasets.
Added multithreading to almost all algorithms.
Enhanced docs and hello world.
Dec-2023
This release requires C++23 or higher.
Added more content to documentation.
Made reading/writing files more streamlined and efficient.
Fixed a bug in Median and Kth_element visitors related to handling NaNs.
Added ability to read/write String Vectors, Double Sets, and String Sets as column elements in CSV2 format.
Added seed option to all algorithms that use random numbers.
Implemented PriceVolumeTrendVisitor visitor.
Implemented QuantQualEstimationVisitor visitor.
Fixed RSIVisitor visitor result to be the same size as its input.
Implemented get_str_col_stats().
Added get_euclidean_norm() to QuadraticMeanVisitor visitor.
Added different normalization-types to NormalizeVisitor visitor.
Added more benchmarking comparing with Pandas and Polars
Made sorting much faster by using ranges and zip.
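The seed option mentioned in this release makes algorithms that rely on random numbers reproducible. The sketch below illustrates the idea with the standard library only: the same seed always yields the same sample of row indices on a given platform. `random_sample_indices()` is a hypothetical helper, not a library function, and the partial Fisher-Yates shuffle is just one way to draw such a sample.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Draws k distinct indices from [0, population) without replacement.
// The result depends only on the arguments, so a fixed seed reproduces it.
std::vector<std::size_t>
random_sample_indices(std::size_t population, std::size_t k,
                      std::uint32_t seed) {
    assert(k <= population);

    std::vector<std::size_t> idx(population);
    std::iota(idx.begin(), idx.end(), std::size_t(0));

    std::mt19937 gen(seed);  // seeded engine instead of std::random_device
    // Partial Fisher-Yates shuffle: fix the first k positions only.
    for (std::size_t i = 0; i < k; ++i) {
        std::uniform_int_distribution<std::size_t> pick(i, population - 1);
        std::swap(idx[i], idx[pick(gen)]);
    }
    idx.resize(k);
    return idx;
}
```

Note that the standard fixes the output of `std::mt19937` but not of the distribution classes, so results are reproducible within one standard library implementation rather than across all of them.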
Oct-2023
Added more content to documentation and Hello World example.
Fixed a bug in join that missed multiple matches in some edge cases
Fixed a bug/edge case in Covariance calculation.
Fixed a bug in reading JSON files.
Utilized meta-programming in several parts of the codebase, especially visitors.
Added a whole lot of C++ concepts throughout the code.
Fixed many const-correctness issues throughout the code.
Added a mechanism for a lot of visitors to be used in groupby and bucketize.
Now in csv2 format you can read/write columns of vector, map, and unordered map types
Enhanced DateTime ISO format parsing.