* Add debug/trace logging to extract_items
* Handle invalid timestamps for livestreams extraction
* Make use of author_fallback in playlist extractor
* Don't use extract_text for video length extraction
The extract_text function attempts to extract from both the simpleText and
the runs route. This is typically what we'd want for text extraction as
it could appear in both locations. However, while this still holds true,
the thumbnailOverlayTimeStatusRenderer writes a numerical length (when
present on the video) to the simpleText route and uses runs for a
text overlay like "LIVE" or "PREMIERE".
Therefore, when a video has a text overlay instead of a numerical one,
Invidious still passes it onto decode_length_seconds, which obviously
raises since it cannot be converted into integers.
In the future, if more routes requires one text route over the other, we
should go ahead and add an argument to extract_text itself. Though for
now, this is sufficient.
* Handle unsupported "special" categories
- Add extract_text to simplify extraction of InnerTube texts
- Add helper extractor methods to reduce repetition in parsing InnerTube
- Change [] more than 2 blocks long to use #dig or #dig?
- Remove useless ?.try blocks for items that always exists
- Add (some) documentation to VideoRendererParser
This is a squash of a bunch of commits
cherry-picked commits
Fix category parse error on search
(cherry picked from commit cc02fed4e69f0eb5f19e017173632b3a3f20519f)
Fix category items not being extracted in search
(cherry picked from commit 2605b9c609ff217b5a6ae09d22450596dcad90fc)
Make search not include category items for now
(cherry picked from commit ca4afd59f46b595e3c339f31432cad98a5771ee1)
Change behavior of categories in search results
(cherry picked from commit cc1067561051b1c113b490e79c4a71cd346f7b3f)
Fix missing search results in extraction
(cherry picked from commit abda6840d5bfe58f845128bdd1a3f4916dd3bb84)
Fix miscount of search results
(cherry picked from commit 491e33450eb1300d0234bb33df0d0e78a027114f)
This commit adds a new parser for YT's shelfRenderers which are
typically used to denote different categories.The code for featured
channels parsing has also been moved to use the new parser but some
additional refactoring are needed there.
The ContinuationExtractor has also been improved and is now capable of
extraction continuation data that is packaged under
"appendContinuationItemsAction"
In additional this commit adds some useful helper functions to extract
the current selected tab the continuation token. This is to mainly
reduce code size and repetition.
--
This cherry-picked commit also removes the code for parsing featured
channels present on the original.
(cherry picked from commit 8000d538dbbf1eb9c78e000b1449926ba3b24da9)
This commit completely rewrites the extract_item and extract_items
function. Before this commit these two function were an unreadable
mess. The extract_item function was a lengthy if-elsif chain
while the extract_items function contained an incomprehensible
mess of .try, else and ||.
With this commit both of these functions have been pulled into a
separate file with the internal logic being moved to a few classes.
This significantly reduces the size of these two methods, enhances
readability and makes adding new extraction/parse rules much simpler.
See diff for details.
--
This cherry-picked commit also removes the code for parsing featured
channels present on the original.
(cherry picked from commit a027fbf7af1f96dc26fe5a610525ae52bcc40c28)
Upstream requires at least two additional sources. Whereas Invidious needs it to be
able to display a single additional source for normal (dashless)
qualites. Aka medium and hd720.
Video mimetype may contain code information between double quotes.
If not properly escaped, it breaks the browser's parser. E.g:
```
type="video/mp4; codecs=" avc1.64001f,="" mp4a.40.2""=""
```
Thank Robin for catching this!
* Extract feed routes from invidious.cr
* Removes the deprecated route for /feed/top
* Deprecate /view_all_playlist & use /feed/playlists
* Move feed views into their own directory
* Add haltf method to halt current route context
* Change status_code + return blocks to use haltf
* Set appropriate response headers for RSS routes
- Auth (excluding notifications*) APIs
- Mixes
*Notifications currently require the "connection_channel" channel
for talking with the notifications job. Unfortunately, we cannot
access that within the route modules yet.
* use the new youtube api for comments
* remove PG_DB & action parameter + allow force region
* support new comments data with onResponseReceivedEndpoints
* Extract primary channel routes from invidious.cr
Also removes timedtext_video stub since all it does is redirect to the
homepage. However, Invidious's 404 handler already does this.
--
As the template for the channel about page doesn't exist yet, the
behavior for the /channel/:ucid/about endpoint has been changed to be
the same as what's currently present on Invidious
(cherry picked from commit 8fad19d8057d7d22e3de27ebbc88a9978c1df27b)
* Manually extract brand_redirect from 1b569bbc99207cae7c20aa285f42477ae361dd30
This commit manually extracts the brand_redirect function from the
commit mentioned.
However, the redirect to the `.../about` endpoint is removed due to the
fact that it doesn't exist yet.
This commit is also mainly just a bridge for the next few cherry picks from
\#2215
* Update brand_redirect to use youtubei resolve_url
(cherry picked from commit 53335fe7cfdfac392365b7cac447bc7cc6478134)
* Add additional channel endpoints to brand_redirect
(cherry picked from commit 8fc6f3add637dabb09b2034f4d82fc3d039ba15c)
* Add separate handler for /profile endpoint
* Add /channel/:ucid/home route
* Document all channel brand_urls
* Move Crystal stdlib classes overrides to a separate file
* Document known crystal overrides
* Update crystal overrides for HTTP::Client socket
* Update shard.yml to restrict crystal versions
* Fix compilation error in Crystal 1.1.x (See
https://github.com/crystal-lang/crystal/issues/10965
for more details about this issue).
The private `_post_json` method of the YoutubeAPI requires a ClientConfig
as the third parameter. This was passed in all Youtube API methods except the
`#resolve_url` method.
* Put youtube API functions under the YoutubeAPI namespace
* Implement the following endpoints:
- `next`
- `player`
- `resolve_url`
* Allow a ClientConfig to be passed to YoutubeAPI endpoint handlers.
* Add constants for many new clients
* Fix documentation of YoutubeAPI.browse(): Comments and search
result aren't returned by the browse() endpoint but by the next()
and search() endpoints, respectively.
* Accept gzip compressed data, to help save on bandwidth
* Add debug/trace logging
* Other minor fixes
Fixes:
* Sanitize user-provided content in HTML (Fixes#2193)
* Fix encoding of search query in prev/next pages (Fixes#2229)
* Fix some issues introduced with #2196:
- Fix alignment of all <h3> elements (Move the inline style from the parent to the <h3> element)
- Add missing comma on 'dir' HTML attribute (Typo introduced by PR #2196)
Code cleaning:
* Remove unnecessary 'each_sclice' + 'each' double loop in ECR files
* Clean the player's <source> list generation code (in player.ecr)
Related to #1416, it doesn't really fix the real error, but instead mutes the exception message.
Like explained in #1416, this "exception Error" while flushing the client data doesn't harm the client-server connection. However, this exception message continuously spams the logs and makes debugging and error finding really difficult.
Cherry picked from ui overhaul branch with a few modifications:
- channel folder is renamed to channels
- parsing for channel home and featured channels are removed due to
lack of infrastructure from other commits
(cherry picked from commit 44d18b8e147b47ad06a54cc6fd08423d9f39074d)
i was injecting custom css into the site that made the avatars round, and noticed comment avatars looked a little odd
i opened dev tools and siffed through the html, and noticed that the image was being padded,
when it would look nicer if the element used margin instead of padding
with padding:
https://imgur.com/c0pB37e
with proposed changes (margin instead of padding):
https://imgur.com/iKmBzEi
The behavior was as follow: on Right-To-Left text (e.g Arabic) that is wrapped
(because it's too long to fit on one line), the second row and following rows
may or may not be right aligned (as RTL text should be). Opening the devtools
fixes that alignement, as consistently as closing the devtool breaks it.
This problem seems to arrive only in the following configurations (link nested
in a paragraph, both of which may or may not have the dir= attribute):
* `<p><a href="some_link">RTL_TEXT</a></p>`
* `<p><a href="some_link" dir="auto">RTL_TEXT</a></p>`
* `<p dir="auto"><a href="some_link">RTL_TEXT</a></p>`
with the following CSS:
```
p {
unicode-bidi: plaintext;
text-align: start;
}
```
Changing the HTML to the following configuration (a paragraph with the dir=
attribute, nested in a link) seems to fix it:
`<a href="some_link"><p dir="auto">RTL_TEXT</p></a>`
This will prevent, on large pages, the LTR and RTL text to be
far away, on each side of the page. This could happen on channel
and playlists descriptions, when the page is displayed on a large
screen.
* Remove percent-encoding of the search query when calling youtube API, as it
breaks UTF-8
* Empty search redirects to /search, not /
* Show the fullscreen search "home page" (from #1977) at /search
* Allow 'region=' parameter to be passed to /search
* Other minor fixes
Add documentation
Bump web client version string
Add charset=UTF-8 to the 'content-type' header
Parse JSON and return it as a Hash
Handle API error messages
Simple routes have been moved into a single `Misc` file.
Embed routes have been moved into a single `Embed` file.
The preferences route has been renamed to be more consistent with other parts
of the codebase.
The config file can now be specified with `INVIDIOUS_CONFIG_FILE`.
A YAML formatted string can still be passed with `INVIDIOUS_CONFIG`, replacing
the config file.
Additionally all options can now be specified as environment variables.
The syntax for variable names is `INVIDIOUS_` followed by the option name in
upper case. The values are parsed as YAML.
These new env vars only update the provided main configuration, but it is
possible to point the config file at the example config and then use env vars
for all config options:
```
INVIDIOUS_CONFIG_FILE=./config/config.example.yml \
INVIDIOUS_CHANNEL_THREADS=10 \
./invidious
```
The default log level has been changed from `debug` to `info`.
The `debug` log level is now more verbose. `debug` now gives a general overview
of what is happening (where implemented) while `trace` gives all available
details.
The crystal http client maintains a keepalive connection to the other
server which stays alive for some time. This should be closed if the
client instance is not used again to avoid hogging resources
This is similar to the removed `top-enabled` option but for the Popular feed.
The instance needs to be restarted if the feed was enabled.
Editing admin options on the preferences page is also fixed.
The handling of the feed pages now only happens in a single place.
Instead of redirecting:
- The Top feed now displays a message that it was removed from Invidious.
- The Popular feed now displays a message that it was disabled if it was.
Traces can be enabled with `-l trace`.
The problem with subscriptions is that sometimes requests to YouTube never
finish. As soon as that happens `channel-threads` times subscriptions stop
being refreshed. This is most likely a problem with the lsquick bindings.
Everything that gets logged now has a log level associated with it.
The log level can be set with the new `-l` or `--log-level` arguments.
The defaul log level is `debug` for now. There aren't many things that get
logged but if the logs get spammed in the future it can be set down to `info`.
The Top feed used to be a feed based on YouTube ratings. Once YouTube removed
publicly available ratings the Top feed was removed from Invidious but the
option to display a link to it remained.
Besides `auto`, `best` and `worst` it is now possible to select a target height.
If the target height is not available the closest lower height is selected.
* Update the cryptocurrency address with newly created one
* Replace the icon used for the donation address and link
* Replace the word Monero with the word XMR
* Replace the Liberapay placeholder with a link to the documentation
The YouTube headers are now always added for requests to YouTube.
Previously they were only added for requests going through QUIC.
The session token is now JSON decoded to unescape escaped Unicode characters.
The comment continuation protobuf has been updated and the request now goes
through the YouTube `pbj` JSON API.
Redirect channels may use JS to redirect now, instead of only a response header
as it used to be. This fix reads the channel to redirect to from `ytInitialData`.
The `ytInitialPlayerResponse` regex can now handle `var` and `window`
assignments.
The video streams can now be extracted from `player_response` and
`initial_data`.
This fixes the descriptions on videos and videos themselves. Videos are
technically broken right now, but work becasue of a fallback that goes through
embeds.
Electric Boogaloo
The long backtrace has been moved into a `<details>` HTML element, as suggested
by @B0pol. To make the error still visible it has been added to the top under
`Title:`. This also encourages informative issue titles.
Error handling has been reworked to always go through the new `error_template`,
`error_json` and `error_atom` macros.
They all accept a status code followed by a string message or an exception
object. `error_json` accepts a hash with additional fields as third argument.
If the second argument is an exception a backtrace will be printed, if it is a
string only the string is printed. Since up till now only the exception message
was printed a new `InfoException` class was added for situations where no
backtrace is intended but a string cannot be used.
`error_template` with a string message automatically localizes the message.
Missing error translations have been collected in https://github.com/iv-org/invidious/issues/1497
`error_json` with a string message does not localize the message. This is the
same as previous behavior. If translations are desired for `error_json` they
can be added easily but those error messages have not been collected yet.
Uncaught exceptions previously only printed a generic message ("Looks like
you've found a bug in Invidious. [...]"). They still print that message
but now also include a backtrace.
Now that themes are controlled with a class instead of setting
media="none" on the stylesheet link and both themes already being
duplicated in default.css for the automatic themeing it makes sense
to have all theme related CSS in the same place.
This commit also fixes the missing dark theme on embeds.
Themes are now controlled with a class on the body element.
If a preference is set the body element will have either "dark-theme"
or "light-theme" class. If no preference is set or the preference is
empty the class will be "no-theme".
"dark-theme" and "light-theme" are handled by darktheme.css and
lighttheme.css respectively.
"no-theme" is handled by default.css where depending on the value of
"prefers-color-scheme" the styles corresponding to "dark-theme" or
"light-theme" are applied.
Unfortunately this means that both themes are duplicated, once in the
theme .css and once in default.css.
The index was set to index - 1, causing the first video to be shifted in fetch_playlist_videos
(because of its index being -1 lower than it should) and thus not displayed on playlist page.
Using the player on latest Safari, the tooltip appears and stays stuck for long even when switching to fullscreen which is annoying. You need to explicitly click anywhere to dismiss that stuck tooltip.
This doesn't seem to happen in Firefox so I am not sure whether this is a browser bug, but in any case I don't see any value in keeping this tooltip so maybe we can just remove it?
In practice with the patch I usually see backoff to 2 hours when blocked, so it should improve recovery time. The lim_thread is to work with multi-threading, not sure if it's the best way to do it.
* Use new API to fetch videos from channels
This mirrors the process used by subscriptions.gir.st. The old API is
tried first, and if it fails then the new one is used.
* Use the new API whenever getting videos from a channel
I created the get_channel_videos_response function because now instead
of just getting a single url, there are extra steps involved in getting
the API response for channel videos, and these steps don't need to be
repeated throughout the code.
The only remaining exception is the bypass_captcha function, which still
only makes a request to the old API. I don't know whether this code
needs to be updated to use the new API for captcha bypassing to work
correctly.
* Correctly determine video length with new api
* Remove unnecessary line
* More consistent IDs for info section
More consistent IDs for info section: watch-on-youtube, annotations and download
* Consistent IDs: channel-name
* Consistent IDs: published-date
The term "published" can also be found in the answer for the following YouTube API request: https://developers.google.com/youtube/v3/docs/videos/list
* Verify token signature in constant time
To prevent timing side channel attacks
* Run cheap checks first in token validation process
Expensive checks such as the nonce lookup on the database or the
signature check can be run after cheap/fast checks.