OpenRefine v3.8-beta1

Pre-release, 2 months ago

This is the first beta release of the 3.8 series. Please back up your workspace directory before installing, and report any problems that you encounter.

New features and improvements

Keyboard navigation improvements

  • Dialogs are focusable (#5578, by @Abbe98)
  • The tab order for reconciliation create and match buttons is fixed (#5685, by @Abbe98)
  • The show/hide left panel is keyboard accessible (#5852, by @Abbe98)
  • Menu buttons in the extension-bar can be opened with the keyboard (#5853, by @Abbe98)
  • Tab buttons are outlined when focused (#5851, by @Abbe98)
  • The element outlines set by the browser are retained (#5867, by @Abbe98)
  • The project permalink can be selected (#5871, by @Abbe98)
  • The button to rename a project can be selected (#5868, by @Abbe98)
  • The column header buttons are selectable/openable by keyboard (#5854, by @Abbe98)
  • Make action area tabs selectable by keyboard (#5885, by @Abbe98)
  • Make it possible to set custom headers using only the keyboard when fetching urls (#5886, by @Abbe98)
  • The menu system is navigable by keyboard (#5901, by @Abbe98)
  • The "Get data from" menu is keyboard accessible (#5900, by @Abbe98)
  • Cells in the grid can be edited by keyboard (#5855, by @Abbe98)

Reconciliation usability improvements

  • The waiting screen while guessing reconciliation types is internal to the reconciliation dialog (#4877, by @ayushrai206)
  • The "auto-match" checkbox persists after restarts of OpenRefine (#4722, by @ayushrai206)
  • The documentation of the reconciliation service is displayed in the reconciliation dialog (if available) (#5784, by @ayushrai206)
  • The default types supplied by the reconciliation service are always offered to users (#4224, by @ayushrai206)
  • The reconciliation types are displayed with both name and id (#5907, by @ayushrai206)
  • OpenRefine honours the batch size announced by reconciliation services (#5603, by @ayushrai206)
  • The dialog for the operation to add a column of entity identifiers is improved (#5998, by @elebitzero)
  • Errors encountered by the reconciliation operation are displayed in the grid and are available via the cell.recon.error GREL expression (#3194, by @ayushrai206)
  • Those errors can also be isolated via facets (#6232, by @ayushrai206)
  • The interface to select a reconciliation service when reconciling was improved (#6118, by @ayushrai206 and @Lydiaofficial)
  • The "Search for match" option is present in cells with reconciliation errors so that they can be fixed manually (#6192, by @ayushrai206)
  • The error messages generated during reconciling are more helpful (#6111, by @ayushrai206)
  • A new operation to extract URLs for reconciled cells is available (#5960, by @ayushrai206)
  • Property selection in the reconciliation dialog gives better feedback to the user about whether a column is successfully mapped to a property or not (#6060, by @elebitzero)
  • Type selection is similarly improved (#6131, by @elebitzero)
  • Only up to three reconciliation candidates are displayed by default, with the option to see more (#6154, by @ayushrai206 and @Lydiaofficial)
  • A new design for the reconciliation dialog was proposed but has not been implemented yet. Your opinion about it is welcome on the forum (by @Lydiaofficial)
  • It is possible to discover the source of a column obtained by fetching data from a reconciliation service by hovering over the column header (#5130, by @ayushrai206)

Facet improvements

Improvements to linking to specific parts of OpenRefine via URLs

  • It is possible to link to a given home screen panel (#5597, by @Abbe98)
  • The tags used to filter projects are also reflected in the URL (#5769, by @Abbe98)

Layout improvements to the Wikibase extension

Other improvements

  • Applying a list of operations stored in a JSON file is possible without copy and paste (#5022, by @IjayAbby)
  • The cluster choice limit is configurable via the preferences (#5847, by @5tigerjelly)
  • The metadata dialog shows JSON contents in a more readable way (#5870, by @Abbe98)
  • OpenRefine is known to be compatible with Java versions 11 to 21 (#5930, by @wetneb)
  • URIs with the geo: protocol are rendered as links in cells (#5940, by @Abbe98)
  • Project archives can be imported via URL (#5431, by @SoniaSun810)
  • A Windows installer is available in addition to the existing zip distribution. We no longer publish a zip distribution without embedded Java. Do let us know if this is a problem for you. (#3224, by @dori4n)
  • CSV Byte Order Mark (BOM) is supported (#1241, by @tfmorris)
  • More HTTP headers are supported when fetching a column via URLs (#6334, by @tfmorris)
  • Columns can be expanded with a more easily identifiable button (#5879, by @VhugoJc)
  • TSV export avoids adding unnecessary quotes around cells (#2071, by @tfmorris)
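
To illustrate why Byte Order Mark handling matters for CSV import, here is a minimal Python sketch. It is not OpenRefine's implementation (OpenRefine is written in Java); the sample data and variable names are hypothetical. It only shows how an unhandled UTF-8 BOM ends up attached to the first column name.

```python
import codecs
import csv
import io

# Hypothetical sample: a UTF-8 encoded CSV file, as raw bytes, starting with a BOM.
raw = codecs.BOM_UTF8 + "name,city\nAda,London\n".encode("utf-8")

# Decoding as plain UTF-8 keeps the BOM as U+FEFF attached to the first header.
naive_header = next(csv.reader(io.StringIO(raw.decode("utf-8"))))

# The "utf-8-sig" codec drops a leading BOM when one is present.
clean_header = next(csv.reader(io.StringIO(raw.decode("utf-8-sig"))))

print(repr(naive_header[0]))
print(repr(clean_header[0]))
```

An importer that ignores the BOM produces a first column whose name silently differs from what the user sees, which typically surfaces later as a failed column lookup.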

GREL changes

  • The forRange GREL function accepts negative increments (#5520, by @Huishin-pie)
  • Accessing the record & columnNames fields works again (#5633, by @tfmorris)
  • The split() function filters out the trailing empty token when there is a trailing string separator, and the leading empty token when there is a leading pattern separator match (#5587, by @tfmorris)
  • The replaceEach() function is more faithful to its documentation (#5463, by @Huishin-pie)
  • The cross() function returns an empty list on no match (#5531, by @jenny-Musah)
  • The length() function returns the number of keys in an object (#5991, by @tfmorris)
  • The forEachIndex() control supports JSON objects and arrays (#3147, by @tfmorris)
  • Operators (+ and comparison operators such as <) have fewer unexpected behaviors (#6340, #6341, by @tfmorris)
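
As a rough illustration of what a negative increment means for a forRange-style loop, here is a Python analogue. It is a sketch rather than GREL itself: the helper name for_range is hypothetical, and it assumes the start bound is inclusive and the end bound exclusive in both directions, which should be checked against the GREL reference.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def for_range(start: int, stop: int, step: int, fn: Callable[[int], T]) -> List[T]:
    """Hypothetical Python analogue of a forRange-style loop.

    Assumption: `start` is inclusive and `stop` is exclusive, whether the
    loop counts up (positive step) or down (negative step).
    """
    if step == 0:
        raise ValueError("step must be non-zero")
    return [fn(i) for i in range(start, stop, step)]


# Counting up: visits 1, 4, 7.
ascending = for_range(1, 10, 3, lambda v: v * 2)

# Counting down with a negative increment: visits 10, 7, 4, 1.
descending = for_range(10, 0, -3, lambda v: v * 2)

print(ascending)
print(descending)
```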

Bug fixes

  • The reconciliation dialog does not suggest properties specific to a type when "No Type" is selected (#5523, by @tfmorris)
  • The word facet uses an internationalized word separator instead of just space characters (#557, by @tfmorris)
  • The HTTP proxy configuration is correctly handled (#5476, by @tledoux)
  • The 'Facet choices' dialog does not bleed over the dialog window on resize (#5619, by @elebitzero)
  • The memory usage display shows up correctly on project creation (#5665, by @elebitzero)
  • The readability of the no-projects message is improved (#5679, by @Abbe98)
  • The visual style of the database import panel is more consistent with the rest of the application (#5548, by @Abbe98)
  • Dialogs cannot be dragged past the top of the window (#5714, by @elebitzero)
  • The clipboard import input does not overflow in narrow windows (#5753, by @Abbe98)
  • File upload buttons have a style that is consistent with other buttons (#5743, by @Abbe98)
  • Errors in preference changes are properly reported (#5772, #5785, by @tsukipedia and @tfmorris)
  • Numeric facets are properly refreshed when switching tabs (#5781, by @elebitzero)
  • Importing from URLs that contain illegal characters is handled (#4625, by @yeungven)
  • The auto-completion for fields in the Wikibase schema conforms to the new MediaWiki API (#5716, by @SoniaSun810)
  • When removing rows, the cache of facet counts is correctly updated, updating the duplicates facet as well (#5799, by @tfmorris)
  • Column names are correctly quoted in the SQL exporter (#5388, by @mahikaajain)
  • The star/flag action correctly updates the row without reverting reconciliation changes (#5738, by @SoniaSun810)
  • The pointer cursor only changes when hovering over the column menu, not the entire column header (#5977, by @Abbe98)
  • The "Use quote as separator" defaults to False for TSV import (#3853, by @jnchen1)
  • The "Transpose cells across columns" treats blank cells like null cells (#5229, by @skhoylow8)
  • Wikibase edits on deleted items are skipped (#5385, by @wetneb)
  • The OpenRefine launcher on Windows no longer refuses to start if the Java version is too high (#5583, by @wetneb)
  • Reconciliation services can no longer be added multiple times (#5926, by @tsukipedia)
  • The SQL importer no longer checks for particular keywords in the query (#6019, by @tfmorris)
  • The set of characters allowed in file names of media uploads to Wikibase is extended (#5656, by @santi4o and @wetneb)
  • Case sensitive sort of rows works as expected (#6047, by @tfmorris)
  • Missing spaces in facet sorting and in the "add column based on reconciled values" dialog are restored (#6047, #6143, by @frafra and @SrinathKadam048)
  • The URL fetching operation returns an error, not null, on a bad URL (#6137, by @tfmorris)
  • Bzip2 import works again and now supports concatenated compressed streams (#6129, by @tfmorris)
  • Dates on Open Project page are localized (#6172, by @tfmorris)
  • Text in alert dialogs is selectable (#6187, by @elebitzero)
  • The grid is properly updated when a cell is matched (#6236, by @elebitzero)
  • The message shown at the top of the screen after operations has fewer encoding errors (#6063, by @tfmorris)
  • Setting the log level from the command line works again (#6286, by @amparab)
  • Creating a project from a URL with trailing whitespace no longer fails (#6330, by @surajbora59)
  • A spurious warning about missing units in the Wikibase extension was removed (#5452, by @payalsaraljain)
  • The ./refine script ignores HTTP proxies for querying OpenRefine directly (#2000, by @tejasbhosale17)
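
For readers unfamiliar with concatenated bzip2 streams (the Bzip2 import fix above), the following Python sketch shows what such a file looks like and why a reader that stops after the first stream loses data. It uses Python's standard bz2 module purely for illustration and is not OpenRefine's Java code; the sample rows are made up.

```python
import bz2

# Two independently compressed chunks, appended as if by `cat a.bz2 b.bz2`.
first = bz2.compress(b"id,name\n1,Ada\n")
second = bz2.compress(b"2,Grace\n")
concatenated = first + second

# bz2.decompress() reads every stream in the input.
all_rows = bz2.decompress(concatenated)

# A single BZ2Decompressor stops at the end of the first stream;
# the untouched remainder is left in `unused_data`.
decoder = bz2.BZ2Decompressor()
first_only = decoder.decompress(concatenated)

print(all_rows)
print(first_only, decoder.eof, len(decoder.unused_data))
```

A single-stream reader returns only the first chunk without raising an error, so the missing rows are easy to overlook.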

Performance improvements

  • We are now using the uniVocity CSV parser instead of the Apache Commons CSV parser. Beyond the performance improvements this brings, it is likely that this change comes with different parsing behaviour in some cases. Do let us know if those seem to be regressions (#2268, #1372, by @tfmorris)
  • Project initialization is faster because the frontend sends its requests in parallel (#5941, by @Abbe98)
  • The metadata files of projects are only written when needed (#3805, by @ComgLq24)
  • Newly read projects are not written before they're modified (#3805, by @tfmorris)
  • The Jython interpreter is not initialized during startup but only the first time it is used (#6174, by @tfmorris)
  • The clustering dialog no longer runs the default clustering method by default to avoid unnecessary heavy computations on large projects (#241, by @elebitzero)

For developers

  • The timestamps for project changes have been migrated to UTC time objects (#3047), and HistoryEntry was switched from OffsetDateTime (at UTC) to Instant (#6176)
  • We migrated from using .less to .css for our stylesheets. Extensions should still be able to use .less but are encouraged to migrate to using CSS variables instead (#5525). This is part of an ongoing effort to offer a dark mode (#3017, by @Abbe98 and @elebitzero)
  • Extensions can now register commands which can respond to HTTP HEAD requests (#6097, by @Abbe98)
  • The get-all-preferences command now responds to HTTP GET requests. HTTP POST requests are still supported, but extensions and clients are encouraged to migrate (#5850, by @Abbe98)
  • The create-project-from-upload command can now be used to set a project description and creator (#5739, by @Abbe98)
  • It is now optional for action areas to implement a resize function (#5598, by @Abbe98)
  • Project tags are exposed in a data attribute (#5590, by @Abbe98)
  • SVG images are supported in Butterfly (simile-butterfly#90, by @Abbe98)
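
For clients migrating the preferences call from POST to GET, a minimal Python sketch follows. The base URL assumes an OpenRefine instance on its default local address, and the /command/core/ path prefix is an assumption to verify against your installation; the helper name preferences_request is hypothetical. The sketch only builds the request, so it runs without a server.

```python
from urllib.request import Request

# Assumption: OpenRefine is running locally on its default address and port.
BASE_URL = "http://127.0.0.1:3333"


def preferences_request(base_url: str = BASE_URL) -> Request:
    """Build (but do not send) a GET request for the preferences command.

    The "/command/core/" path prefix is an assumption, not taken from
    these release notes.
    """
    return Request(f"{base_url}/command/core/get-all-preferences", method="GET")


req = preferences_request()
print(req.get_method(), req.full_url)

# To actually send it against a running instance:
#   import json
#   from urllib.request import urlopen
#   with urlopen(req) as response:
#       preferences = json.load(response)
```

Because a GET request has no side effects, it avoids the extra handling that POST requests to OpenRefine commands may require.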
