This syncs our copy with the current state of the ai.robots.txt
repository. Upstream has tightened their scope to be AI-only, whereas
before it included a bunch of SEO and "web intelligence" marketing
stuff. I've kept those but moved them into their own section.
* [feature] Email change
* frontend stuff for changing email
* docs
* tests etc
* differentiate more clearly between local user+account and account
* populate user
This updates the robots.txt based on the list of the ai.robots.txt
repository. We can look at automating that at some point.
It's worth pointing out that some robots, namely the ones by Bytedance,
are known to ignore robots.txt entirely.
* update settings panels, add pending overview + approve/deny functions
* add admin accounts get, approve, reject
* send approved/rejected emails
* use signup URL
* docs!
* email
* swagger
* web linting
* fix email tests
* wee lil fixerinos
* use new paging logic for GetAccounts() series of admin endpoints, small changes to query building
* shuffle useAccountIDIn check *before* adding to query
* fix parse from toot react error
* use `netip.Addr`
* put valid slices in globals
* optimistic updates for account state
---------
Co-authored-by: kim <grufwub@gmail.com>
* [feature] User sign-up form and admin notifs
* add chosen + filtered languages to migration
* remove stray comment
* chosen languages schmosen schmanguages
* proper error on local account missing
* [feature] User-selectable preset themes
* docs, more theme stuff
* lint, tests
* fix css name
* correct some little issues
* add another theme
* fix poll background
* okay last theme i swear
* make retrieval of apimodel themes more conventional
* preallocate stylesheet slices
* [chore] Refactor HTML templates and CSS
* eslint
* ignore "Local"
* rss tests
* fiddle with OG just a tiny bit
* dick around with polls a bit more so SR stops saying "clickable"
* remove break
* oh lord
* don't lazy load avatar
* fix ogmeta tests
* clean up some cruft
* catch remaining calls to c.HTML
* fix error rendering + stack overflow in tag
* allow templating attributes
* fix indent
* set aria-hidden on status complementary content, since it's already present in the label anyway
* tidy up templating calls a little
* try to make styling a bit more consistent + readable
* fix up some remaining CSS issues
* fix up reports
* update go text, include text/display
* [feature] Set instance langs, show post lang on frontend
* go fmt
* WebGet
* set language for whole article, don't use FA icon
* mention instance languages + other optional config vars
* little tweak
* put languages in config properly
* warn log language parse
* change some naming around
* tidy up validate a bit
* lint
* rename LanguageTmpl in template
* deinterface router, start messing about with deadlines
* weeeee
* thanks linter (thinter)
* write Connection: close when timing out requests
* update wording
* don't replace req
* don't bother with fancy Cause functions (I'll use them one day...)
* [feature] Block Google Bard/AI crawlers
* [feature] Block the other OpenAI crawler
* [feature] Block Common Crawl crawler
This is used in research, but also gleefully advertises itself as the
training source used in all LLMs and GPT-3.
Fixes: #2240
* [feature] Block Omgilikebot
Used by some shady big web data engine company.
* [feature] Block Meta's language model crawler
* [feature] Block well-known.dev crawler
* add automatic cache max size generation based on ratios of a singular fixed memory target
Signed-off-by: kim <grufwub@gmail.com>
* remove now-unused cache max-size config variables
Signed-off-by: kim <grufwub@gmail.com>
* slight ratio tweak
Signed-off-by: kim <grufwub@gmail.com>
* remove unused visibility config var
Signed-off-by: kim <grufwub@gmail.com>
* add secret little ratio config trick
Signed-off-by: kim <grufwub@gmail.com>
* fixed a word
Signed-off-by: kim <grufwub@gmail.com>
* update cache library to remove use of TTL in result caches + slice cache
Signed-off-by: kim <grufwub@gmail.com>
* update other cache usages to use correct interface
Signed-off-by: kim <grufwub@gmail.com>
* update example config to explain the cache memory target
Signed-off-by: kim <grufwub@gmail.com>
* update env parsing test with new config values
Signed-off-by: kim <grufwub@gmail.com>
* do some ratio twiddling
Signed-off-by: kim <grufwub@gmail.com>
* add missing header
* update envparsing with latest defaults
Signed-off-by: kim <grufwub@gmail.com>
* update size calculations to take into account result cache, simple cache and extra map overheads
Signed-off-by: kim <grufwub@gmail.com>
* tweak the ratios some more
Signed-off-by: kim <grufwub@gmail.com>
* more nan rampaging
Signed-off-by: kim <grufwub@gmail.com>
* fix envparsing script
Signed-off-by: kim <grufwub@gmail.com>
* update cache library, add sweep function to keep caches trim
Signed-off-by: kim <grufwub@gmail.com>
* sweep caches once a minute
Signed-off-by: kim <grufwub@gmail.com>
* add a regular job to sweep caches and keep under 80% utilisation
Signed-off-by: kim <grufwub@gmail.com>
* remove dead code
Signed-off-by: kim <grufwub@gmail.com>
* add new size library used to libraries section of readme
Signed-off-by: kim <grufwub@gmail.com>
* add better explanations for the mem-ratio numbers
Signed-off-by: kim <grufwub@gmail.com>
* update go-cache
Signed-off-by: kim <grufwub@gmail.com>
* library version bump
Signed-off-by: kim <grufwub@gmail.com>
* update cache.result{} size model estimation
Signed-off-by: kim <grufwub@gmail.com>
---------
Signed-off-by: kim <grufwub@gmail.com>
* update go-fed
* do the things
* remove unused columns from tags
* update to latest lingo from main
* further tag shenanigans
* serve stub page at tag endpoint
* we did it lads
* tests, oh tests, ohhh tests, oh tests (doo doo doo doo)
* swagger docs
* document hashtag usage + federation
* instanceGet
* don't bother parsing tag href
* rename whereStartsWith -> whereStartsLike
* remove GetOrCreateTag
* dont cache status tag timelineability
* catch SQLITE_BUSY errors, wrap bun.DB to use our own busy retrier, remove unnecessary db.Error type
Signed-off-by: kim <grufwub@gmail.com>
* remove dead code
Signed-off-by: kim <grufwub@gmail.com>
* remove more dead code, add missing error arguments
Signed-off-by: kim <grufwub@gmail.com>
* update sqlite to use maxOpenConns()
Signed-off-by: kim <grufwub@gmail.com>
* add uncommitted changes
Signed-off-by: kim <grufwub@gmail.com>
* use direct calls-through for the ConnIface to make sure we don't double query hook
Signed-off-by: kim <grufwub@gmail.com>
* expose underlying bun.DB better
Signed-off-by: kim <grufwub@gmail.com>
* retry on the correct busy error
Signed-off-by: kim <grufwub@gmail.com>
* use longer possible maxRetries for db retry-backoff
Signed-off-by: kim <grufwub@gmail.com>
* remove the note regarding max-open-conns only applying to postgres
Signed-off-by: kim <grufwub@gmail.com>
* improved code commenting
Signed-off-by: kim <grufwub@gmail.com>
* remove unnecessary infof call (just use info)
Signed-off-by: kim <grufwub@gmail.com>
* rename DBConn to WrappedDB to better follow sql package name conventions
Signed-off-by: kim <grufwub@gmail.com>
* update test error string checks
Signed-off-by: kim <grufwub@gmail.com>
* shush linter
Signed-off-by: kim <grufwub@gmail.com>
* update backoff logic to be more transparent
Signed-off-by: kim <grufwub@gmail.com>
---------
Signed-off-by: kim <grufwub@gmail.com>
* [bugfix] Set Vary header correctly on cache-control
* Prefer activitypub types on AP endpoints
* use immutable on file server, vary by range
* vary auth on Accept
* [bugfix] Tidy up rss feed serving; don't error on empty feed
* fall back to account creation time as rss feed update time
* return feed early when account has no eligible statuses
* [bugfix] Fix NegotiateAccept with multi accept
There's a bug in Gin's NegotiateFormat that doesn't handle the presence
of multilpe accept headers. This lifts the code from the PR @tsmethurst
sent a year ago to Gin into our codebase to fix the issue.
* [bugfix] Concat accept header in webfinger
Some implementations bug out when there's multiple accept headers,
including Gin (see 7050112af1). But things
seem to work reliably with a single accept header with multiple parts.
Fixes: #1793
* [chore] Remove years from all license headers
Years or year ranges aren't required in license headers. Many projects
have removed them in recent years and it avoids a bit of yearly toil.
In many cases our copyright claim was also a bit dodgy since we added
the 2021-2023 header to files created after 2021 but you can't claim
copyright into the past that way.
* [chore] Add license header check
This ensures a license header is always added to any new file. This
avoids maintainers/reviewers needing to remember to check for and ask
for it in case a contribution doesn't include it.
* [chore] Add missing license headers
* [chore] Further updates to license header
* Use the more common // indentend comment format
* Remove the hack we had for the linter now that we use the // format
* Add SPDX license identifier
* start fiddling
* the ol' fiddle + update
* start working on fetching statuses
* poopy doopy doo where r u uwu
* further adventures in featuring statuses
* finishing up
* fmt
* simply status unpin loop
* move empty featured check back to caller function
* remove unnecessary log.WithContext calls
* remove unnecessary IsIRI() checks
* add explanatory comment about status URIs
* change log level to error
* better test names
* implement status pin client api + web handler
* make test names + comments more descriptive
* don't use separate table for status pins
* remove unused add + remove checking
* tidy up + add some more tests
This changes parseDescription to properly encode things to be safe for
usage without removing things like backslashes that may be relevant.
* text.SanitizePlaintext already calls html.UnescapeString so we don't
have to do that
* Replace \n with space early
* Remove duplicate white-space by splitting on fields and joining
* HTML-escape the string we have
* For extra certainty, encode the backslash as \
Fixes#1549
This adds a lightweight form of tracing to GTS. Each incoming request is
assigned a Request ID which we then pass on and log in all our log
lines. Any function that gets called downstream from an HTTP handler
should now emit a requestID=value pair whenever it logs something.
Co-authored-by: kim <grufwub@gmail.com>
* interim commit: start refactoring middlewares into package under router
* another interim commit, this is becoming a big job
* another fucking massive interim commit
* refactor bookmarks to new style
* ambassador, wiz zeze commits you are spoiling uz
* she compiles, we're getting there
* we're just normal men; we're just innocent men
* apiutil
* whoopsie
* i'm glad noone reads commit msgs haha :blob_sweat:
* use that weirdo go-bytesize library for maxMultipartMemory
* fix media module paths
* convert most of the caches to use result.Cache{}
* add caching of emojis
* fix issues causing failing tests
* update go-cache/v2 instances with v3
* fix getnotification
* add a note about the left-in StatusCreate comment
* update EmojiCategory db access to use new result.Cache{}
* fix possible panic in getstatusparents
* further proof that kim is not stinky