It turns out that sorting entries by inode slightly improves performance on HDDs, and there is a cheap way to get the inode number of each entry.
This optimization is currently very local: sorting covers only the contents of a single directory. We could do better in the future by keeping unprocessed directories in a priority queue, but that would require many more changes to the code.
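
For illustration, here is a minimal sketch of the per-directory sorting idea. It is written in Python and is not the code from this change; `entries_sorted_by_inode` is a hypothetical helper name. It assumes a Unix-like system, where `os.scandir` typically gets the inode number from the directory read itself, so no extra `stat` call per entry is needed.

```python
import os


def entries_sorted_by_inode(path):
    """Return the entries of a single directory, ordered by inode number.

    Hypothetical helper for illustration only. The sort is local to one
    directory; nothing is ordered across directories.
    """
    with os.scandir(path) as it:
        # On Unix, DirEntry.inode() normally reuses the inode number that
        # came back with the directory listing, which keeps this cheap.
        return sorted(it, key=lambda entry: entry.inode())
```

A caller would then process the returned entries in order, for example `for entry in entries_sorted_by_inode("."): ...`, instead of iterating over the directory in whatever order the file system returns.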