Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strip newlines from AP Article object HTML output #98

Closed
wants to merge 2 commits into from

Conversation

dariusk
Copy link
Contributor

@dariusk dariusk commented Apr 30, 2019

Some federated services, like Mastodon, interpret newlines in Html literally. Since it costs nothing (and saves a few bytes) to strip newlines from the HTML output, of the Article object, this commit does just that.

What posts look like in Mastodon (modified to have Article accepted) before:

image

And after this PR:

image


  • I have signed the CLA

Some federated services, like Mastodon, interpret newlines in Html
literally. Since it costs nothing (and saves a few bytes) to strip
newlines from the HTML output, of the Article object, this commit does
just that.
@dariusk dariusk mentioned this pull request Apr 30, 2019
1 task
@thebaer
Copy link
Member

thebaer commented Apr 30, 2019

As I mentioned in #97, I'd definitely like to make this work. But ideally, I'd like to preserve newlines for the cases where they're used for creative effect, as they often are on Write.as / WriteFreely.

One case would be when writing poetry -- someone might use single, not double, line breaks (e.g. this post). In those cases, we don't insert <br /> tags or any other markup indicating the single newline. So ideally, this would only strip newlines between HTML block elements (h1, h2, p, etc.) in the ActivityPub data. And that would have the added bonus of preserving single line breaks in the body text when the data shows up on Mastodon.

@dariusk
Copy link
Contributor Author

dariusk commented Apr 30, 2019

That's fair. Anyway, the actual issue here is that Mastodon runs simple_format on all incoming text payloads including HTML (this transforms all single-newlines into <br>s) so I think I will just fix that instead. This is all for an experimental fork I'm running anyway to support Article objects; my hope is to get some of this into upstream but it might be fully rejected.

@dariusk dariusk closed this Apr 30, 2019
@thebaer
Copy link
Member

thebaer commented Apr 30, 2019

Ah, gotcha. Seems kind of crazy that they'd run it on HTML as well.

Always happy to consider alternatives on our end, if it'll help.

@dariusk
Copy link
Contributor Author

dariusk commented Apr 30, 2019

Oh, they don't actually run it on HTML. It's why they convert everything to a stub! So there's no html at all. I'm working on a branch that accepts a subset of HTML (strong, em, and other "safe" tags) and am trying to shake out the issues from that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants