github crawlab-team/crawlab v0.6.0-beta.20210803

2 years ago

Change Log (v0.6.0-beta.20210803)

Overview

This is the beta release for the next major version, v0.6.0. It is recommended NOT to use it in production, as it is not fully tested and thus not stable enough. Furthermore, more features, including those not ready in the beta release (e.g. Git, Scrapy, Notification), are planned to be integrated into the live version in the form of plugins.

Enhancement

As a major release, v0.6 (including beta versions) consists of a number of large changes to enhance the performance, scalability, robustness and usability of Crawlab. This beta version is theoretically more robust than older versions, mainly in task execution, file synchronization and node management, yet we still recommend that users run thorough tests with various samples.

Backend

  • File Synchronization. Migrated file sync from MongoDB GridFS to SeaweedFS for better stability and robustness.
  • Node Communication. Migrated node communication from Redis-based RPC to gRPC. Worker nodes indirectly interact with MongoDB by making gRPC calls to the master node.
  • Task Queue. Migrated task queue from Redis list to MongoDB collection to allow more flexibility (e.g. priority queue).
  • Logging. Migrated the log storage system to SeaweedFS to resolve performance issues in MongoDB.
  • SDK Integration. Migrated results data ingestion from native SDK to task handler side.
  • Task Related. Abstracted task-related logic into Task Scheduler, Task Handler and Task Runners to increase decoupling and improve scalability and maintainability.
  • Componentization. Introduced DI (dependency injection) framework and componentized modules, services and sub-systems.
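
To illustrate why moving the task queue from a Redis list to a sortable collection makes a priority queue possible, here is a minimal in-memory sketch. It is not Crawlab's implementation: the class and field names (`PriorityTaskQueue`, `QueuedTask`, `priority`, `seq`) and the convention that a lower number means higher priority are assumptions made for the example. A plain list can only serve tasks in arrival order, whereas records carrying a priority field can be served in (priority, arrival) order.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class QueuedTask:
    # Hypothetical record: lower number = higher priority (an assumption);
    # seq keeps tasks of equal priority in first-in, first-out order.
    priority: int
    seq: int
    task_id: str = field(compare=False)


class PriorityTaskQueue:
    """In-memory stand-in for a queue sorted by (priority, arrival order)."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def enqueue(self, task_id, priority=5):
        # Push the task together with its arrival sequence number.
        heapq.heappush(self._heap, QueuedTask(priority, next(self._counter), task_id))

    def dequeue(self):
        """Return the id of the next task to run, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap).task_id


queue = PriorityTaskQueue()
queue.enqueue("crawl-news", priority=5)
queue.enqueue("crawl-urgent", priority=1)
queue.enqueue("crawl-blog", priority=5)

# The urgent task jumps ahead; the two equal-priority tasks keep arrival order.
order = [queue.dequeue() for _ in range(3)]
print(order)
```

In a database-backed queue the same ordering would typically come from sorting on a priority field and a creation timestamp rather than from a heap.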

Frontend

  • Vue 3. Migrated to Vue 3, the latest version of the frontend framework, to support more advanced features such as the Composition API and TypeScript.
  • UI Framework. Rebuilt with the Vue 3-based UI framework Element-Plus, moving away from Vue-Element-Admin, for more flexibility and functionality.
  • Advanced File Editor. Support more advanced file editor features including drag-and-drop copying/moving of files, renaming, deleting, file editing, code highlighting, nav tabs, etc.
  • Customizable Table. Support more advanced built-in operations such as column adjustment, batch operations, searching, filtering, sorting, etc.
  • Nav Tabs. Support multiple nav tabs for viewing different pages.
  • Batch Creation. Support batch creation of objects including spiders, projects, schedules, etc.
  • Detail Navigation. Sidebar navigation in detail pages.
  • Enhanced Dashboard. More stats charts in home page dashboard.

TODOs

Because this is a beta release, some existing useful features such as Git and Scrapy integration may not be available. However, we are trying to include them in the official v0.6.0 release, as some of their core functionality is already in the code base. They will be added to the stable version only once they are fully tested.

  • Plugin Framework. Advanced features will exist in the form of plugins, or pluggable modules.
  • Git Integration. To be included as a plugin.
  • Scrapy Integration. To be included as a plugin.
  • Notifications. To be included as a plugin.
  • Associated Tasks. There will be main tasks and their sub-tasks if task mode is "all nodes" or "selected nodes".
  • Crontab Editor. Frontend component that visualizes crontab editing.
  • Results Deduplication.
  • Environment Variables.
  • Internationalization. Support Chinese.
  • Frontend Utility Enhancement. Advanced features such as saved table customization.
  • Log Auto Cleanup.
  • Documentation.
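
The planned Associated Tasks item above can be pictured as a fan-out from one main task to one sub-task per target node. The sketch below is only an illustration under assumptions: the `Task` fields, the `fan_out` helper and the sub-task id scheme are hypothetical, and only the mode names "all nodes" and "selected nodes" come from the list above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Task:
    # Hypothetical task record; parent_id links a sub-task to its main task.
    id: str
    node_id: Optional[str] = None
    parent_id: Optional[str] = None


def fan_out(main_id: str, mode: str, all_nodes: List[str],
            selected: Optional[List[str]] = None) -> Tuple[Task, List[Task]]:
    """Create one main task plus one sub-task per target node."""
    if mode == "all nodes":
        targets = list(all_nodes)
    elif mode == "selected nodes":
        wanted = set(selected or [])
        # Keep only known nodes, in their original order.
        targets = [n for n in all_nodes if n in wanted]
    else:
        raise ValueError(f"mode {mode!r} does not create sub-tasks")
    main = Task(id=main_id)
    subs = [Task(id=f"{main_id}-{n}", node_id=n, parent_id=main_id) for n in targets]
    return main, subs


main, subs = fan_out("t1", "selected nodes", ["n1", "n2", "n3"], selected=["n1", "n3"])
for sub in subs:
    print(sub.id, "runs on", sub.node_id, "under", sub.parent_id)
```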

What's Next

This beta release is only a preview and a testing ground for the core functionalities in Crawlab v0.6. Therefore, we invite you to download it and run more tests. The official release is expected to be ready after major issues from the beta version are sorted out and the Plugin Framework and other key features are developed and fully tested. With that in mind, a second beta version before the main release is also possible.
