Commit graph

37 commits

Author SHA1 Message Date
tobi f45c3b3708 bit more hacking away at it 2024-12-24 18:05:20 +01:00
kim 87cff71af9
[feature] persist worker queues to db (#3042)
* persist queued worker tasks to database on shutdown, fill worker queues from database on startup

* ensure the tasks are sorted by creation time before pushing them

* add migration to insert WorkerTask{} into database, add test for worker task persistence

* add test for recovering worker queues from database

* quick tweak

* whoops we ended up with double cleaner job scheduling

* insert each task separately, because bun is throwing some reflection error??

* add specific checking of cancelled worker contexts

* add http request signing to deliveries recovered from database

* add test for outgoing public key ID being correctly set on delivery

* replace select with Queue.PopCtx()

* get rid of loop now we don't use it

* remove field now we don't use it

* ensure that signing func is set

* header values weren't being copied over 🤦

* use ptr for httpclient.Request in delivery

* move worker queue filling to later in server init process

* fix rebase issues

* make logging less shouty

* use slices.Delete() instead of copying / reslicing

* have database return tasks in ascending order instead of sorting them

* add a 1 minute timeout to persisting worker queues
2024-07-30 13:58:31 +02:00
kim cde2fb6244
[feature] support processing of (many) more media types (#3090)
* initial work replacing our media decoding / encoding pipeline with ffprobe + ffmpeg

* specify the video codec to use when generating static image from emoji

* update go-storage library (fixes incompatibility after updating go-iotools)

* maintain image aspect ratio when generating a thumbnail for it

* update readme to show go-ffmpreg

* fix a bunch of media tests, move filesize checking to callers of media manager for more flexibility

* remove extra debug from error message

* fix up incorrect function signatures

* update PutFile to just use regular file copy, as changes are file is on separate partition

* fix remaining tests, remove some unneeded tests now we're working with ffmpeg/ffprobe

* update more tests, add more code comments

* add utilities to generate processed emoji / media outputs

* fix remaining tests

* add test for opus media file, add license header to utility cmds

* limit the number of concurrently available ffmpeg / ffprobe instances

* reduce number of instances

* further reduce number of instances

* fix envparsing test with configuration variables

* update docs and configuration with new media-{local,remote}-max-size variables
2024-07-12 09:39:47 +00:00
kim a483bd9e38
[performance] massively improved ActivityPub delivery worker efficiency (#2812)
* add delivery worker type that pulls from queue to httpclient package

* finish up some code commenting, bodge a vendored activity library change, integrate the deliverypool changes into transportcontroller

* hook up queue deletion logic

* support deleting queued http requests by target ID

* don't index APRequest by hostname in the queue

* use gorun

* use the original context's values when wrapping msg type as delivery{}

* actually log in the AP delivery worker ...

* add uncommitted changes

* use errors.AsV2()

* use errorsv2.AsV2()

* finish adding some code comments, add bad host handling to delivery workers

* slightly tweak deliveryworkerpool API, use advanced sender multiplier

* remove PopCtx() method, let others instead rely on Wait()

* shuffle things around to move delivery stuff into transport/ subpkg

* remove dead code

* formatting

* validate request before queueing for delivery

* finish adding code comments, fix up backoff code

* finish adding more code comments

* clamp minimum no. senders to 1

* add start/stop logging to delivery worker, some slight changes

* remove double logging

* use worker ptrs

* expose the embedded log fields in httpclient.Request{}

* ensure request context values are preserved when updating ctx

* add delivery worker tests

* fix linter issues

* ensure delivery worker gets inited in testrig

* fix tests to delivering messages to check worker delivery queue

* update error type to use ptr instead of value receiver

* fix test calling Workers{}.Start() instead of testrig.StartWorkers()

* update docs for advanced-sender-multiplier

* update to the latest activity library version

* add comment about not using httptest.Server{}
2024-04-11 11:45:35 +02:00
kim d61d5c8a6a
[bugfix] httpclient not signing subsequent redirect requests (#2798)
* move http request signing to transport

* actually hook up the http roundtripper ...

* add code comments for the new gtscontext functions
2024-04-02 13:12:26 +02:00
kim 1d51e3c8d6
[bugfix] 2643 bug search for account url doesnt always work when redirected (#2673)
* update activity library so dereferencer returns full response and checks *final* link to allow for redirects

* temporarily add bodged fixed library

* remove unused code

* update getAccountFeatured() to use dereferenceCollectionPage()

* make sure to release map

* perform a 2nd decode to ensure reader is empty after primary decode

* add comment explaining choice of using Decode() instead of Unmarshal()

* update embedded activity library to latest matching https://github.com/superseriousbusiness/activity/pull/21

* add checks to look for changed URI and re-check database if redirected

* update max iteration count to 512, add checks during dereferenceAncestors() for indirect URLs

* remove doubled-up code

* fix use of status instead of current

* use URIs for checking equality for security

* use the latest known URI for boost_of_uri in case original was an indirect

* add dereferenceCollection() function for dereferenceAccountFeatured()

* pull in latest github.com/superseriousbusiness/activity version (and remove the bodge!!)

* fix typo in code comments

* update decodeType() to accept a readcloser and handle body closing

* switch to checking using BoostOfID and add note why not using BoostOfURI

* ensure InReplyTo gets unset when deleting status parent in case currently stubbed

* add tests for Collection and CollectionPage iterators
2024-02-23 16:24:40 +01:00
Milas Bowman af1a26a68f
[feature] Add Mastodon-compatible HTTP signature fallback (#2659)
On outgoing `GET` requests that are signed (e.g. authorized fetch),
if the initial request fails with `401`, try again, but _without_
the query parameters included in the HTTP signature.

This is primarily useful for compatibility with Mastodon; though
hopefully this can be removed in the not-too-distant future, as
they've started changing their behavior here.

Signed-off-by: Milas Bowman <devnull@milas.dev>
2024-02-19 11:18:17 +01:00
tobi b614d33c40
[feature] Try HTTP signature validation with and without query params for incoming requests (#2591)
* [feature] Verify signatures both with + without query params

* Bump to tagged version
2024-01-31 14:15:28 +00:00
tobi 5eddef6c9b
[feature] Add /api/v1/admin/debug/apurl endpoint (#2359) 2023-11-27 14:02:52 +00:00
tobi 24fbdf2b0a
[chore] Refactor AP authentication, other small bits of tidying up (#1874) 2023-06-13 15:47:56 +01:00
kim 6a29c5ffd4
[performance] improved request batching (removes need for queueing) (#1687)
* revamp http client to not limit requests, instead use sender worker

Signed-off-by: kim <grufwub@gmail.com>

* remove separate sender worker pool, spawn 2*GOMAXPROCS batch senders each time, no need for transport cache sweeping

Signed-off-by: kim <grufwub@gmail.com>

* improve batch senders to keep popping recipients until remote URL found

Signed-off-by: kim <grufwub@gmail.com>

* fix recipient looping issue

Signed-off-by: kim <grufwub@gmail.com>

* fix missing mutex unlock

Signed-off-by: kim <grufwub@gmail.com>

* move request id ctx key to gtscontext, finish filling out more code comments, add basic support for not logging client IP

Signed-off-by: kim <grufwub@gmail.com>

* slight code reformatting

Signed-off-by: kim <grufwub@gmail.com>

* a whitespace

Signed-off-by: kim <grufwub@gmail.com>

* remove unused code

Signed-off-by: kim <grufwub@gmail.com>

* add missing license headers

Signed-off-by: kim <grufwub@gmail.com>

* fix request backoff calculation

Signed-off-by: kim <grufwub@gmail.com>

---------

Signed-off-by: kim <grufwub@gmail.com>
2023-04-28 17:45:21 +02:00
Daenney 5e2bf0bdca
[chore] Improve copyright header handling (#1608)
* [chore] Remove years from all license headers

Years or year ranges aren't required in license headers. Many projects
have removed them in recent years and it avoids a bit of yearly toil.

In many cases our copyright claim was also a bit dodgy since we added
the 2021-2023 header to files created after 2021 but you can't claim
copyright into the past that way.

* [chore] Add license header check

This ensures a license header is always added to any new file. This
avoids maintainers/reviewers needing to remember to check for and ask
for it in case a contribution doesn't include it.

* [chore] Add missing license headers

* [chore] Further updates to license header

* Use the more common // indentend comment format
* Remove the hack we had for the linter now that we use the // format
* Add SPDX license identifier
2023-03-12 16:00:57 +01:00
kim d8d5818b47
[bugfix] internal server error on search not found (#1590)
* add error value wrapping, include status code / not found flags from transport errors, update error usages

Signed-off-by: kim <grufwub@gmail.com>

* add code commenting for gtserror functions

Signed-off-by: kim <grufwub@gmail.com>

---------

Signed-off-by: kim <grufwub@gmail.com>
2023-03-06 10:38:43 +01:00
kim a684fc4628
[chore] transport improvements (#1524)
* improve error readability, mark "bad hosts" as fastFail

Signed-off-by: kim <grufwub@gmail.com>

* pull in latest go-byteutil version with byteutil.Reader{}

Signed-off-by: kim <grufwub@gmail.com>

* use rewindable body reader for post requests

Signed-off-by: kim <grufwub@gmail.com>

---------

Signed-off-by: kim <grufwub@gmail.com>
2023-02-18 17:02:19 +01:00
Daenney 68e6d08c76
[feature] Add a request ID and include it in logs (#1476)
This adds a lightweight form of tracing to GTS. Each incoming request is
assigned a Request ID which we then pass on and log in all our log
lines. Any function that gets called downstream from an HTTP handler
should now emit a requestID=value pair whenever it logs something.

Co-authored-by: kim <grufwub@gmail.com>
2023-02-17 12:02:29 +01:00
kim 70739d32cc
[performance] remove throttling timers (#1466)
* remove throttling timers, support setting retry-after, use retry-after in transport

* remove unused variables

* add throttling-retry-after to cmd flags

* update envparsing to include new throttling-retry-after

* update example config to include retry-after documentation

* also support retry-after formatted as date-time, ensure max backoff time

---------

Signed-off-by: kim <grufwub@gmail.com>
2023-02-10 20:16:01 +00:00
tobi 0dbe6c514f
[chore] Update/add license headers for 2023 (#1304) 2023-01-05 12:43:00 +01:00
Daniele Sluijters c5ae88c51b
[chore] Set User-Agent header in transport (#1154)
Currently requests set their own User-Agent. This moves it down to set
it in the transport's do() method, to guarantee it's always set on all
requests.
2022-11-26 20:19:42 +00:00
tobi c9d893fec1
[feature/performance] Fail fast when doing remote transport calls inside incoming request contexts (#1119)
* [feature/performance] Fail fast when doing remote transport calls inside incoming request contexts

* [chore] Reduce outgoing request timeout to 15s

* log error messages when fastfailing

* use context.Value() instead of wrapped context, wrap error with fastfail instead of extra log entry

* add fast-fail context key test

Signed-off-by: kim <grufwub@gmail.com>
Co-authored-by: kim <grufwub@gmail.com>
2022-11-23 21:40:07 +00:00
kim e58a6a2da3
[performance] cache domains after max retries in transport (#884) 2022-10-08 13:50:16 +02:00
kim 1d999712e6
[feature] update config types to use bytesize.Size (#828)
* update config size types to use bytesize.Size

* submit unchecked-out file ... 🤦

* fix bytesize config var decoding

* bump bytesize version

* update kim's libraries in readme

* update envparse.sh to output more useful errors

* improve envparse.sh

* remove reliance on jq

* instead, use uint64 for bytesize flag types

* remove redundant type

* fix viper unmarshaling

* Update envparsing.sh

* fix envparsing test

Signed-off-by: kim <grufwub@gmail.com>
Co-authored-by: tobi <31960611+tsmethurst@users.noreply.github.com>
2022-09-29 21:50:43 +01:00
kim 098dbe6ff4
[chore] use our own logging implementation (#716)
* first commit

Signed-off-by: kim <grufwub@gmail.com>

* replace logging with our own log library

Signed-off-by: kim <grufwub@gmail.com>

* fix imports

Signed-off-by: kim <grufwub@gmail.com>

* fix log imports

Signed-off-by: kim <grufwub@gmail.com>

* add license text

Signed-off-by: kim <grufwub@gmail.com>

* fix package import cycle between config and log package

Signed-off-by: kim <grufwub@gmail.com>

* fix empty kv.Fields{} being passed to WithFields()

Signed-off-by: kim <grufwub@gmail.com>

* fix uses of log.WithFields() with whitespace issues and empty slices

Signed-off-by: kim <grufwub@gmail.com>

* *linter related grumbling*

Signed-off-by: kim <grufwub@gmail.com>

* gofmt the codebase! also fix more log.WithFields() formatting issues

Signed-off-by: kim <grufwub@gmail.com>

* update testrig code to match new changes

Signed-off-by: kim <grufwub@gmail.com>

* fix error wrapping in non fmt.Errorf function

Signed-off-by: kim <grufwub@gmail.com>

* add benchmarking of log.Caller() vs non-cached

Signed-off-by: kim <grufwub@gmail.com>

* fix syslog tests, add standard build tags to test runner to ensure consistency

Signed-off-by: kim <grufwub@gmail.com>

* make syslog tests more robust

Signed-off-by: kim <grufwub@gmail.com>

* fix caller depth arithmatic (is that how you spell it?)

Signed-off-by: kim <grufwub@gmail.com>

* update to use unkeyed fields in kv.Field{} instances

Signed-off-by: kim <grufwub@gmail.com>

* update go-kv library

Signed-off-by: kim <grufwub@gmail.com>

* update libraries list

Signed-off-by: kim <grufwub@gmail.com>

* fuck you linter get nerfed

Signed-off-by: kim <grufwub@gmail.com>

Co-authored-by: tobi <31960611+tsmethurst@users.noreply.github.com>
2022-07-19 10:47:55 +02:00
tobi 1cdc163276
[performance] Don't retry/backoff invalid http requests that will never succeed (#609)
* add httpguts (ew)

* add ValidateRequest err wrapping logic

* don't retry on unrecoverable errors

* i am very clever
2022-05-26 13:38:41 +02:00
kim 223025fc27
[security] transport.Controller{} and transport.Transport{} security and performance improvements (#564)
* cache transports in controller by privkey-generated pubkey, add retry logic to transport requests

Signed-off-by: kim <grufwub@gmail.com>

* update code comments, defer mutex unlocks

Signed-off-by: kim <grufwub@gmail.com>

* add count to 'performing request' log message

Signed-off-by: kim <grufwub@gmail.com>

* reduce repeated conversions of same url.URL object

Signed-off-by: kim <grufwub@gmail.com>

* move worker.Worker to concurrency subpackage, add WorkQueue type, limit transport http client use by WorkQueue

Signed-off-by: kim <grufwub@gmail.com>

* fix security advisories regarding max outgoing conns, max rsp body size

- implemented by a new httpclient.Client{} that wraps an underlying
  client with a queue to limit connections, and limit reader wrapping
  a response body with a configured maximum size
- update pub.HttpClient args passed around to be this new httpclient.Client{}

Signed-off-by: kim <grufwub@gmail.com>

* add httpclient tests, move ip validation to separate package + change mechanism

Signed-off-by: kim <grufwub@gmail.com>

* fix merge conflicts

Signed-off-by: kim <grufwub@gmail.com>

* use singular mutex in transport rather than separate signer mus

Signed-off-by: kim <grufwub@gmail.com>

* improved useragent string

Signed-off-by: kim <grufwub@gmail.com>

* add note regarding missing test

Signed-off-by: kim <grufwub@gmail.com>

* remove useragent field from transport (instead store in controller)

Signed-off-by: kim <grufwub@gmail.com>

* shutup linter

Signed-off-by: kim <grufwub@gmail.com>

* reset other signing headers on each loop iteration

Signed-off-by: kim <grufwub@gmail.com>

* respect request ctx during retry-backoff sleep period

Signed-off-by: kim <grufwub@gmail.com>

* use external pkg with docs explaining performance "hack"

Signed-off-by: kim <grufwub@gmail.com>

* use http package constants instead of string method literals

Signed-off-by: kim <grufwub@gmail.com>

* add license file headers

Signed-off-by: kim <grufwub@gmail.com>

* update code comment to match new func names

Signed-off-by: kim <grufwub@gmail.com>

* updates to user-agent string

Signed-off-by: kim <grufwub@gmail.com>

* update signed testrig models to fit with new transport logic (instead uses separate signer now)

Signed-off-by: kim <grufwub@gmail.com>

* fuck you linter

Signed-off-by: kim <grufwub@gmail.com>
2022-05-15 11:16:43 +02:00
tobi e63b653199
[performance] Add dereference shortcuts to avoid making http calls to self (#430)
* update transport (controller) to allow shortcuts

* go fmt

* expose underlying sig transport to allow test sigs
2022-03-15 15:01:19 +01:00
tsmethurst c157b1b20b rework data function to provide filesize 2022-01-23 14:41:58 +01:00
tsmethurst 589bb9df02 pass reader around instead of []byte 2022-01-16 18:52:55 +01:00
tsmethurst 113f9d9ab4 pass a function into the manager, start work on emoji 2022-01-11 17:49:14 +01:00
tsmethurst f61c3ddcf7 compiling now 2022-01-08 17:17:01 +01:00
tobi ef5a9256a8
Extend license notices to 2022 (#354) 2021-12-20 18:42:19 +01:00
tobi 09ef9e639e
move to ssb gofed fork (#298) 2021-11-13 17:29:43 +01:00
R. Aidan Campbell 083099a957
reference global logrus (#274)
* reference logrus' global logger instead of passing and storing a logger reference everywhere

* always directly use global logrus logger instead of referencing an instance

* test suites should also directly use the global logrus logger

* rename gin logging function to clarify that it's middleware

* correct comments which erroneously referenced removed logger parameter

* setting log level for tests now uses logrus' exported type instead of the string value, to guarantee error isn't possible
2021-10-11 14:37:33 +02:00
tobi 2dc9fc1626
Pg to bun (#148)
* start moving to bun

* changing more stuff

* more

* and yet more

* tests passing

* seems stable now

* more big changes

* small fix

* little fixes
2021-08-25 15:34:33 +02:00
Tobi Smethurst 87cf621e21
Remote instance dereferencing (#70)
Remote instances are now dereferenced when they post to an inbox on a GtS instance.

    Dereferencing will be done first by checking the /api/v1/instance endpoint of an instance.
    If that doesn't work, /.well-known/nodeinfo will be checked.
    If that doesn't work, only a minimal representation of the instance will be stored.

A new field was added to the Instance database model. To create it:

alter table instances add column contact_account_username text;
2021-06-27 16:52:18 +02:00
tsmethurst 0fe853b1ee first implementation of search feature 2021-05-29 19:36:54 +02:00
Tobi Smethurst d839f27c30
Follows and relationships (#27)
* Follows -- create and undo, both remote and local
* Statuses -- federate new posts, including media, attachments, CWs and image descriptions.
2021-05-21 15:48:26 +02:00
Tobi Smethurst 6cd033449f
Refine statuses (#26)
Remote media is now dereferenced and attached properly to incoming federated statuses.
    Mentions are now dereferenced and attached properly to incoming federated statuses.
    Small fixes to status visibility.
    Allow URL params for filtering statuses:

	// ExcludeRepliesKey is for specifying whether to exclude replies in a list of returned statuses by an account.
      	// PinnedKey is for specifying whether to include pinned statuses in a list of returned statuses by an account.
      	// MaxIDKey is for specifying the maximum ID of the status to retrieve.
      	// MediaOnlyKey is for specifying that only statuses with media should be returned in a list of returned statuses by an account.

    Add endpoint for fetching an account's statuses.
2021-05-17 19:06:58 +02:00