hydrusnetwork/hydrus v640 on GitHub

Important

v636 was the last release with a macOS App. Please check that release for information on how to convert a macOS App install to a source install!

new navigation features

thanks to a user, we have some neat new UI tech--
in a normal 'previous/next' media viewer, there is now a 'show random' button in the top-right hover. this jumps to a random position in the list. you can right-click this button to walk back, too! the 'media navigation:random/undo random' shortcut actions are settable under the 'media viewer - normal browser' shortcut set. note this is true random, not shuffle
the Main GUI's pages menu now has a 'history' submenu that shows which pages you were last navigated to! if you have a giganto session, see how it feels to work with. I think I'd like to have some page navigation shortcuts tied to this
a new shortcut action, focus the tab the media came from, if possible, and focus the media, which appears in the 'all' media viewes, 'normal browser', and 'media viewers' shortcut sets, now lets you focus the spawning page of this media viewer and the media you are currently looking at. this is in complement to recent 'show page/media' settings recently added on media viewer close for users who regularly use multiple simultaneous media viewers; this does the same, but it leaves the media viewer open and does not switch focus away. in a secret feature, right-clicking the 'drag media' button triggers this command

duplicates

added a 'auto-commit completed batches of this size or smaller' setting to options->duplicates, for the filter. if you finish the current batch without any manual skips, and the number of actions you made is equal to or less than this, it'll just confirm and load the next batch. the default value here is 1--let's see if that makes going through 1/1 batches in group mode a little nicer
'show some random potential pairs' is now an asynchronous job. it won't block the UI any more. while it is working, the button will be disabled
after last week's 'potential duplicates discovery search' overhaul did not bring the house down, I have made it so any new file import will wake the daemon instantly if it can A) work now, and B) there are fewer than 50 files remaining in the search queue. thus, if you are synced on potential dupe discovery, you are going to see new imports searched for potentials and then actioned by auto-resolution rules within moments. again, let's see how this feels IRL. it feels like we need better discoverability of when files are deleted, but I'm of two minds about how to do it
the visual duplicates detector is slightly better at determining RGB hue-shifts as alternates

duplicates auto-resolution

the duplicates auto-resolution daemon now has customisable work/rest settings like the other daemons under options->maintenance and processing. this was all hardcoded before
the 'test A or B' comparator's edit panel now has nice OR UI. the autocomplete dropdown responds to shft+enter, has explicit OR/cancel-OR/rewind-OR buttons, and any 'edit OR' sub-dialog will have similarly limited system predicate support
added two new hardcoded comparators: A and B have the same "has exif" value and A and B have the same "has icc profile" value. these match if A and B are both True or both False--useful if you don't want to accidentally promote a 'bare' file over one with extra metadata
added a new 'OR Comparator' type. it holds a list of comparators and returns True if any are True
I have overhauled the suggested rules--
- the A >= 1.1x B blah is now A > 1.0x B in all cases. IRL feedback suggests this padding was neither helpful nor needed
- the visually similar pairs - eliminate smaller resolution and visually similar pairs - eliminate smaller filesize suggested rules are merged into visually similar pairs that tests A > 1.0x B num_pixels OR A > 1.0x B filesize (while still checking A has bigger or equal filesize, width, and height to be careful)
- the pixel-perfect pairs - eliminate bloat suggested rule is renamed to pixel-perfect pairs
- pixel-perfect pairs and visually similar pairs no longer exclude files with exif or icc data either in search or from B--instead they have comparators that say both A and B have the same "has exif/icc profile" value OR B does not have exif/icc profile (i.e. Yes/Yes, Yes/No, No/No, but not No/Yes). users who care deeply about EXIF or ICC Profiles may wish to edit, but this is a reasonably safe compromise that will work for most
if you have already deployed the suggested rules, have a think about if you want to change to the new defaults. if you do, although it is finicky, I recommend editing your rule in-place to reflect the suggested one, and then you'll keep your rule history (to do this, load up the suggested rule, check its new search and update the old rule to look like that, then export/import the comparators via clipboard, then delete the suggested rule again). note of course that if you change file search and comparators, your rules will reset their search and test status, which for the 'visually similar' rules could mean a lot of reset work! I don't think I'll adjust the logic of the suggested rules much more--although I guess I'll drop the 'no transparency' predicate when visual dupes can handle it better--but I do expect to tweak the 'visual duplicates' algorithm further, so I expect to encourage one more beta-tester test reset in the coming months

downloader stuff

fixed an issue with url class matching priority; domains were all being sorted with equal value after the recent URLDomainMask work. the correct behaviour is longer domains are matched first
subscriptions are better about cancelling pending file work. if there are multiple queries with pending file downloads but the system has to stop before they are all done (this happens a lot when the sub is bandwidth choked), the overseer subscription call is new more aware of the stop reason and will skip checking (and loading/saving!!) the remaining queries for their (instantly failing) thoughts
the routine that says 'hey record bandwidth for the original spawning domain if that differs from the file URL's domain' now works on file import objects that create multiple child import objects, such as pixiv multi-file posts. this tech ensures that bandwidth wait logic lines up across domains when a site stores files on an external CDN
when a gallery url gets a 400 response from the server, the result is now 'ignored', with note '400', just like 403/404 handling. previously, this counted as a full error and was registered as a domain network error, which was causing trouble for those sites that give 400 for the overflow gallery page
if the downloader grabs and tries to import an HTML file, the error note is more helpful. also, it catches JSON with the same hook now too

misc

when you delete lots of thumbs at once, the job now works in batches of 16 files (was 64 previously), and a popup with a progress gauge now appears after three seconds
in the manage tags dialog, the 'file lookup' tag suggestion box's link button now shows any 3XX redirected GET URL the script ran across (e.g. if the MD5 gallery lookup was redirected to a Post URL), and you can now choose to open or copy (previously it just did open)
export folders have two new checkboxes--'overwrite all sidecars on next run' and 'always overwrite all sidecars' to help control sidecar regen. some text scares you away from setting 'always do it' on a short period export folder (issue #1801)
the default period for an export folder is now 24 hours (previously 1 hour, which seems a little keen compared to how we ended up generally using these guys)
all the close-page confirmation yes/no dialogs use the grammar 'Close "name"?'. previously they were a patchwork of different language that generally didn't say the name of the page

client api

/manage_file_relationships/get_potential_pairs has a new parameter, group_mode, a bool, optional, default false, that switches to group mode. in this mode, max_num_pairs is ignored; you get the whole thing
/manage_file_relationships/get_potential_pairs has two more new parameters, duplicate_pair_sort_type and duplicate_pair_sort_asc, both optional, defaulting to 'filesize of larger file--largest first', to handle the new pair sort. they are an int enum and a bool
updated the help to talk about these
wrote unit tests for these
the Client API is now version 81

boring stuff

I did the first half of a debug-level testing suite that will programatically tune the visual duplicates system. it eats a bunch of example files, generates various jpeg quality subsampling, and resize duplicates, and also makes some fake alternates with watermarks, artist corrections, and colour swaps. the second half will run these files against each other and profile how the internal variables of visual duplicates respond to the wider and more precisely defined range of differences, allowing us to choose better tuning coefficients and automating what I was previously doing manually fingers crossed, this will improve the confidence of visual duplicates, including across subsampling differences (it is bad at this atm), and make future tweaks or 'now we can handle an alpha channel' tech easier to pull off
updated/added unit tests for client api potential pair searching when: there are no special params; there is a search space; there is a min number of rows set; there is a specific sort set; group mode is on
wrote up unit tests for the new exif, icc profile, and OR auto-resolution comparators
fixed up some imperfect regexes in the unit tests
wrote a widget for editing a list of comparators; the selector and comparator OR panels now use this
broke the duplicate filtering page into nicer panel classes. there's still a bit of Qt Signal mess under the hood, but the preparation and filtering tabs are no longer all mixed into the same place
if a dupe filter page does not find pairs to show in the 'show some random pairs' button, the page state (used mostly in client api reporting atm) is now correctly reset from 'loading' to 'normal'
renamed a bunch of patchwork 'work_time'/'time_it_took' variables in my different daemons to 'actual/expected_work_period'
the mixed duplicate pair factory now takes its 'no more than' value during init, decoupling it from the options

hydrusnetwork/hydrus v640 Version 640 on GitHub

new navigation features

duplicates

duplicates auto-resolution

downloader stuff

misc

client api

boring stuff

hydrusnetwork/hydrus v640
Version 640

on GitHub