github jdepoix/youtube-transcript-api v1.0.0

latest release: v1.0.1
one day ago

What's Changed

  • Overhaul of the public API to move away from the static methods get_transcript, get_transcripts and list_transcripts
    • YouTubeTranscriptApi.get_transcript(video_id) is replaced with YouTubeTranscriptApi().fetch(video_id)
    • YouTubeTranscriptApi.list_transcripts(video_id) is replaced with YouTubeTranscriptApi().list(video_id)
    • There is no equivalent for YouTubeTranscriptApi.get_transcripts in the new interface, as it doesn't provide any meaningful utility over just running [ytt_api.fetch(video_id) for video_id in video_ids]
    • By calling .fetch and .list on a YouTubeTranscriptApi instance, we can share an HTTP session between all requests, which allows us to share cookies and reduces redundant requests, thereby saving bandwidth and proxy costs.
    • transcript.fetch() now returns a FetchedTranscript object instead of a list of dictionaries. This allows for adding metadata and utility methods to the returned object. You can still convert a FetchedTranscript object to the previously used format by calling fetched_transcript.to_raw_data().
    • You'll find more details on the updated API in the README. The old static methods can still be used, but have been deprecated and will be removed in a future version!
  • Added new exception types to make the cause of some common errors clearer and allow for catching/handling them
    • RequestBlocked is now raised if the request has been blocked by YouTube due to a blacklisted IP (which would previously raise TranscriptsDisabled, #303)
    • AgeRestricted is raised if the video is age restricted and requires cookie authentication (#111)
    • VideoUnplayable is raised if the video is unplayable for an unknown reason. When this happens, the error message that YouTube would display in the web player is returned by the exception, which should make such errors easier to diagnose. (#219)
  • Added a type hierarchy to configure proxies, which can now be passed into the constructor of YouTubeTranscriptApi. All proxy configs are located in the new module youtube_transcript_api.proxies.
    • A generic HTTP/HTTPS/SOCKS proxy can be configured using the GenericProxyConfig class (similar to how it was done before using the requests dict)
    • Added integration of the proxy provider Webshare, which allows for easily setting up rotating residential proxies using the WebshareProxyConfig
    • You'll find more details on the proxy config classes and how to use them in the README
  • Added the option to pass an HTTP session into the YouTubeTranscriptApi constructor
    • Allows for setting a path to a CA_BUNDLE file (#362, #312)
    • Allows for setting custom headers (#316)
    • Allows for sharing HTTP sessions between multiple instances of YouTubeTranscriptApi
  • Added type signatures to all interfaces
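
As a rough sketch of the migration to the instance-based interface described above: the helper below loosely stands in for the removed get_transcripts by looping over fetch on one shared instance, and converts each FetchedTranscript back to the previous list-of-dictionaries format with to_raw_data(). The names make_api and fetch_many_raw are hypothetical helpers, not part of the library, and the import is deferred so the snippet loads even where the package is not installed.

```python
from typing import Any, Dict, Iterable, List


def make_api() -> Any:
    """Build one YouTubeTranscriptApi instance to reuse across requests."""
    # Deferred import: only needed when an API instance is actually created.
    from youtube_transcript_api import YouTubeTranscriptApi

    return YouTubeTranscriptApi()


def fetch_many_raw(ytt_api: Any, video_ids: Iterable[str]) -> Dict[str, List[dict]]:
    """Hypothetical helper: fetch several videos on one shared instance.

    Calls fetch() once per video on the same object, so the HTTP session
    is shared, and converts each result to the pre-1.0 raw format.
    """
    return {
        video_id: ytt_api.fetch(video_id).to_raw_data()
        for video_id in video_ids
    }
```

Typical use would be fetch_many_raw(make_api(), ["<video_id_1>", "<video_id_2>"]); ytt_api.list(video_id) can be called on the same instance to list the available transcripts.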
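
The new exception types listed above could be handled along these lines. This is only a sketch: explain_failure and try_fetch are hypothetical helpers, and the exceptions are matched by class name purely so the snippet runs without the package installed. In real code one would more likely import RequestBlocked, AgeRestricted and VideoUnplayable from the library and catch them directly.

```python
from typing import Any, Callable, Optional, Tuple

# Hints keyed by exception class name, paraphrasing the release notes.
FAILURE_HINTS = {
    "RequestBlocked": "request blocked by YouTube (blacklisted IP)",
    "AgeRestricted": "video is age restricted and requires cookie authentication",
    "VideoUnplayable": "video is unplayable; the exception carries YouTube's message",
}


def explain_failure(exc: BaseException) -> Optional[str]:
    """Return a hint if exc (or one of its base classes) is a known failure."""
    for cls in type(exc).__mro__:
        hint = FAILURE_HINTS.get(cls.__name__)
        if hint is not None:
            return f"{hint}: {exc}"
    return None


def try_fetch(fetch: Callable[[str], Any], video_id: str) -> Tuple[Any, Optional[str]]:
    """Call fetch(video_id); return (result, None) or (None, hint).

    Exceptions that are not in FAILURE_HINTS are re-raised unchanged.
    """
    try:
        return fetch(video_id), None
    except Exception as exc:
        hint = explain_failure(exc)
        if hint is None:
            raise
        return None, hint
```

Here fetch would typically be the bound method ytt_api.fetch of a YouTubeTranscriptApi instance.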
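
For the proxy configuration classes mentioned above, a minimal sketch might look as follows. The class names and module come from these notes; the keyword names proxy_config, http_url, https_url, proxy_username and proxy_password are assumptions based on the project README and should be checked against the installed version. The wrapper functions are hypothetical.

```python
from typing import Any


def api_with_generic_proxy(http_url: str, https_url: str) -> Any:
    """Build an API instance that routes requests through a generic proxy."""
    # Deferred imports so the snippet loads without the package installed.
    from youtube_transcript_api import YouTubeTranscriptApi
    from youtube_transcript_api.proxies import GenericProxyConfig

    # Keyword names are assumed from the README, not stated in these notes.
    return YouTubeTranscriptApi(
        proxy_config=GenericProxyConfig(http_url=http_url, https_url=https_url)
    )


def api_with_webshare(proxy_username: str, proxy_password: str) -> Any:
    """Build an API instance that uses Webshare rotating residential proxies."""
    from youtube_transcript_api import YouTubeTranscriptApi
    from youtube_transcript_api.proxies import WebshareProxyConfig

    # Keyword names are assumed from the README, not stated in these notes.
    return YouTubeTranscriptApi(
        proxy_config=WebshareProxyConfig(
            proxy_username=proxy_username,
            proxy_password=proxy_password,
        )
    )
```

Either function returns an instance on which .fetch and .list can then be called as usual.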
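
Passing a preconfigured HTTP session into the constructor, as described above, could be sketched like this. The helpers configure_session and apis_sharing_session are hypothetical, the session is assumed to behave like a requests.Session (a verify attribute and a headers mapping), and the constructor keyword http_client is an assumption taken from the project README.

```python
from typing import Any, List, Mapping, Optional


def configure_session(
    session: Any,
    ca_bundle: Optional[str] = None,
    extra_headers: Optional[Mapping[str, str]] = None,
) -> Any:
    """Apply a CA bundle path and custom headers to a requests-style session."""
    if ca_bundle is not None:
        # requests uses Session.verify as the path to a CA_BUNDLE file.
        session.verify = ca_bundle
    if extra_headers:
        session.headers.update(extra_headers)
    return session


def apis_sharing_session(session: Any, count: int = 2) -> List[Any]:
    """Create several API instances that all reuse the same HTTP session."""
    # Deferred import so the snippet loads without the package installed.
    from youtube_transcript_api import YouTubeTranscriptApi

    # The keyword name http_client is assumed from the README.
    return [YouTubeTranscriptApi(http_client=session) for _ in range(count)]
```

With requests available, session would typically be created with requests.Session() before being passed to configure_session.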

Contributors

Due to the rewrite of some interfaces I wasn't able to merge their PRs directly, but special thanks to the work done by @crhowell in #219 and by @andre-c-andersen in #337, as their PRs have been very useful in implementing the new exception types! 😊🙏

Full Changelog: v0.6.3...v1.0.0
