Added
- Dynamically building request URLs from extracted data: `Http` steps now have a new `staticUrl()` method, and you can also use variables within that static URL, as well as in request headers and the body, like `https://www.example.com/foo/[crwl:some_extracted_property]`. These placeholders will be replaced with the corresponding properties from input data (also works with kept data).
- New Refiners:
  - `DateTimeRefiner::reformat('Y-m-d H:i:s')` to reformat a date time string to a different format. Tries to automatically recognize the input format. If this does not work, you can provide an input format to use as the second argument.
  - `HtmlRefiner::remove('#foo')` to remove nodes matching the given selector from selected HTML.
- Steps that produce multiple outputs per input can now group them per input by calling the new `Step::oneOutputPerInput()` method.
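
To illustrate the `[crwl:...]` placeholder syntax mentioned above, here is a minimal standalone sketch of how such a replacement could behave. It is not the library's implementation: the function name `fillPlaceholders()` is hypothetical, and leaving unknown placeholders untouched is an assumption made for this example only.

```php
<?php

/**
 * Hypothetical helper (not part of the library): replaces every
 * [crwl:key] placeholder in $template with the matching value from $data.
 * Assumption: placeholders without a matching key are left as they are.
 */
function fillPlaceholders(string $template, array $data): string
{
    $result = preg_replace_callback(
        '/\[crwl:([^\]]+)\]/',
        // $m[0] is the full placeholder, $m[1] is the property name inside it.
        fn (array $m): string => array_key_exists($m[1], $data)
            ? (string) $data[$m[1]]
            : $m[0],
        $template
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $template;
}

echo fillPlaceholders(
    'https://www.example.com/foo/[crwl:some_extracted_property]',
    ['some_extracted_property' => 'bar']
), PHP_EOL;
// prints "https://www.example.com/foo/bar"
```

The same idea would apply to header values and the request body: each is treated as a template string and filled from the input (or kept) data before the request is sent.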