A lot of it boils down to most fedi software not being “native”: federation is designed more or less as an afterthought on top of a traditional centralized-ish system (even in software that has had federation from the get-go). That means you make assumptions like “it’s fine if I delete the replies of a post if the post gets deleted”.
This, combined with how much data you can’t re-load and have to track as it comes in (e.g. nobody implements the collections you’d need to backfill who liked or boosted what from its source, so you have to track that implicitly through Like and Announce activities), makes it close to infeasible to implement while keeping the same user experience. Hell, even the reply collections needed to backfill missing replies are a rarity (though a lot more common than the others, given Mastodon implements them).
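To make the "track it as it comes in" part concrete, here is a rough sketch of tallying likes and boosts from incoming activities. The `InteractionTracker` class and its in-memory storage are made up for illustration, not taken from any real implementation, and it assumes an `Undo` refers to an activity id you have already seen (either as a bare id or as an embedded object).

```python
from collections import defaultdict


def _id_of(value):
    """Return the id of an AS2 value that may be a bare URI or an embedded object."""
    if isinstance(value, str):
        return value
    if isinstance(value, dict):
        return value.get("id")
    return None


class InteractionTracker:
    """Hypothetical in-memory tally built purely from received activities."""

    def __init__(self):
        self.likes = defaultdict(set)    # object id -> actor ids that liked it
        self.boosts = defaultdict(set)   # object id -> actor ids that boosted it
        self._seen = {}                  # activity id -> (type, actor, target)

    def _bucket(self, kind):
        return self.likes if kind == "Like" else self.boosts

    def handle(self, activity):
        kind = activity.get("type")
        actor = _id_of(activity.get("actor"))
        if kind in ("Like", "Announce"):
            target = _id_of(activity.get("object"))
            if actor is None or target is None:
                return
            self._bucket(kind)[target].add(actor)
            if activity.get("id"):
                # Remember the activity so a later Undo can be resolved.
                self._seen[activity["id"]] = (kind, actor, target)
        elif kind == "Undo":
            undone_id = _id_of(activity.get("object"))
            undone = self._seen.get(undone_id)
            # Only the original actor may undo their own activity.
            if undone and undone[1] == actor:
                old_kind, old_actor, target = undone
                self._bucket(old_kind)[target].discard(old_actor)
                del self._seen[undone_id]
```

Anything that arrived before your server started listening (or that was never delivered to you at all) is simply missing from these tallies, which is the backfill problem described above.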
Additionally, people want the same user experience they’re used to in centralized systems, like search actually searching through everyone, globally. This is something I believe AP simply isn’t “intended” for. ATProto, for example, is much better in this specific regard (but comes at its own hefty costs for implementors).
I don’t blame the implementors for doing things this way. IMO it’s better to partially implement something like AP as an extension, as opposed to diving head first into being AP-native. The standards are extremely vague and incomplete once you start looking below the shallow surface, and this way, if a better protocol comes by, migration (or multi-protocol federation) at least won’t be too difficult compared to if your source of truth was the same AS2 data you federated, the way AP intended you to.
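As a rough illustration of the "AP as an extension" approach, the sketch below keeps an internal `Post` record as the source of truth and only renders AS2 at the federation boundary. The `Post` fields, the `to_as2_note` helper, and the URL layout are all hypothetical; real software differs in the details.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

AS2_CONTEXT = "https://www.w3.org/ns/activitystreams"
AS2_PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


@dataclass
class Post:
    """Hypothetical internal model: this, not AS2, is what gets stored."""
    post_id: int
    author: str           # local username
    html: str
    created_at: datetime  # assumed to be timezone-aware


def to_as2_note(post: Post, base_url: str) -> dict:
    """Render the internal post as an AS2 Note only when it is federated."""
    published = post.created_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "@context": AS2_CONTEXT,
        "id": f"{base_url}/posts/{post.post_id}",
        "type": "Note",
        "attributedTo": f"{base_url}/users/{post.author}",
        "content": post.html,
        "published": published,
        "to": [AS2_PUBLIC],
    }
```

With this split, supporting another protocol would mean writing another renderer next to `to_as2_note` rather than migrating stored AS2 documents.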
not much beyond “look at what other apps you’re trying to interoperate with output and try to reverse engineer your way through”. reading through the sources of other apps may be a good idea.
some links that may get you started, picked from https://socialhub.activitypub.rocks/t/guide-for-new-activitypub-implementers/479 :
and depending on which ecosystem you’re targeting:
counterintuitively, avoid reading the specs if you’re looking to federate with existing software. the official specs are… extremely lacking beyond giving you the bullets to shoot yourself in the foot with: half of what little they define goes unused in the real world, and important things like “how do i know this activity was sent by the person it claims to be” are completely undefined (hint: everyone has more or less settled on http signatures).
once you get something federating, you can then look in the specs in an attempt to learn the concepts in depth, but writing code following the specs will result in code that simply won’t federate.
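to give a feel for the http signatures hint above, here’s a rough sketch of the string that gets signed, in the draft-cavage style most fedi software uses. the helper names are made up, the exact set of headers each server expects is an assumption you should check against the software you’re targeting, and the actual RSA signing step is left out because it needs a third-party crypto library.

```python
import base64
import hashlib


def digest_header(body: bytes) -> str:
    """Value for the Digest header: base64 of the SHA-256 of the request body."""
    return "SHA-256=" + base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")


def signing_string(method: str, path: str, headers: dict, signed: list) -> str:
    """Build the newline-joined string that is signed with the actor's private key.

    `headers` is assumed to use lowercase header names as keys.
    """
    lines = []
    for name in signed:
        if name == "(request-target)":
            # Pseudo-header: lowercased method plus the request path.
            lines.append(f"(request-target): {method.lower()} {path}")
        else:
            lines.append(f"{name.lower()}: {headers[name.lower()]}")
    return "\n".join(lines)


def signature_header(key_id: str, signed: list, signature_b64: str) -> str:
    """Assemble the Signature header once the signing string has been signed."""
    names = " ".join(signed)
    return (
        f'keyId="{key_id}",algorithm="rsa-sha256",'
        f'headers="{names}",signature="{signature_b64}"'
    )
```

the receiving side does the reverse: fetch the actor named by `keyId`, rebuild the same signing string from the request it received, and verify the signature against the actor’s public key.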