pypi saspy 3.6.0

3 years ago

This release has a number of enhancements for df2sd (dataframe2sasdata). This started as performance changes for #326 and included more for #332. The main change concerns calculating the lengths of the dataframe's char columns, which has to be done to declare the correct byte lengths for the corresponding SAS variables in the SAS data set being created. With a DF having 150 million rows and 100 char columns, this step was taking way too long.

I separated this step out of df2sd into its own method, df_char_lengths(), so it can be called independently (by the user, or by df2sd in each access method); it returns a dict with the char column names and lengths. I also enhanced this routine with options that shortcut some of the length calculation so it can run quicker. df2sd accepts these options for when it calls the routine internally, but it can also take a dict of char column names and lengths (the one returned by that method, or one you code yourself), so that the metadata calculation step can be done once, or skipped altogether so that df2sd goes straight to the data transfer.

I also significantly enhanced the data transfer step in the STDIO access method. Transcoding failures are now handled in the data transfer step (though they can still be caught in the length calc routine if wanted), and you now have the option of replacing chars that can't be transcoded with the replacement char instead of failing.

So there's a lot of new functionality and performance improvements that can be tapped into in this version for df2sd. The default behaviors, for the most part, are still the same as they were. So if df2sd seems too slow, there are a number of ways to improve its performance in this version by tweaking these options.
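To make the length calculation concrete, here is a minimal pure-Python sketch of the idea. It is not saspy's implementation: the helper name `char_byte_lengths` and the dict-of-lists stand-in for a dataframe are assumptions made for illustration. It shows why the lengths depend on the target encoding, and how the standard `errors="replace"` codec option substitutes a replacement char instead of failing.

```python
def char_byte_lengths(columns, encoding="utf-8", errors="strict"):
    """Return {column name: longest encoded byte length} for string columns.

    `columns` is a hypothetical stand-in for a dataframe: a dict that maps
    each column name to a list of values. Columns holding no strings are
    treated as non-char columns and skipped.
    """
    lengths = {}
    for name, values in columns.items():
        strings = [v for v in values if isinstance(v, str)]
        if not strings:
            continue  # numeric (or empty) column: no char length to declare
        # The byte length depends on the encoding, not on the character count.
        lengths[name] = max(len(s.encode(encoding, errors)) for s in strings)
    return lengths


data = {
    "name": ["Zoë", "Al", None],
    "city": ["München", "Oslo", "Rome"],
    "age": [31, 45, 27],
}

utf8_lengths = char_byte_lengths(data)                        # multi-byte encoding
latin1_lengths = char_byte_lengths(data, encoding="latin-1")  # single-byte encoding
```

Every string in every char column has to be encoded to find the maximum, which is why this step is expensive on a very large dataframe and why computing the dict once and reusing it pays off.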
Oh, and I almost forgot, df2sd also now has an outdsopts={...} parameter which allows you to specify key=value output data set options for the data set being created: for instance, compress=, encoding=, index=, outrep=, replace=, rename= ...
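As a hedged sketch of how these pieces might be combined: the wrapper function and the table name below are hypothetical, and the keyword names `char_lengths` and `encode_errors` are taken from my reading of the saspy documentation rather than from these notes, so check them against your installed version. `sas` is assumed to be an already connected `saspy.SASsession`.

```python
def upload_with_precomputed_lengths(sas, df, table="big_table"):
    """Compute the char column lengths once, then hand them to df2sd.

    `sas` is assumed to be a connected saspy.SASsession and `df` a pandas
    DataFrame. The keyword names are assumptions to verify against your
    saspy version.
    """
    # Metadata step: returns a dict of char column names and byte lengths.
    lengths = sas.df_char_lengths(df)

    # Data transfer step: reuse the dict so df2sd does not recalculate it.
    return sas.df2sd(
        df,
        table=table,
        char_lengths=lengths,            # dict from the step above
        encode_errors="replace",         # substitute untranscodable chars
        outdsopts={"compress": "yes"},   # output data set options
    )
```

When the same dataframe layout is loaded repeatedly, the `lengths` dict could also be written by hand or cached, which skips the metadata step entirely.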
