Help replacing quotes, ellipses, em dashes with HTML named entities


I’m a writer, trying to convert my book (rtf) to ebook format. I know very little about HTML and am following directions from a website. I’m to the point of having pasted my text into Komodo. (I’m using the free version.) The guy writing the website uses TextMate on a Mac; I use a PC. He indicates I should be able to select what he calls “Convert Selection to Entities excluding Tags” to automatically convert ellipses, quotes, and em dashes to named HTML throughout the manuscript. Is there a way to do that in Komodo? I have been searching for an answer, but I know so little that I can’t do a proper search.

Any help would be deeply appreciated.

hey @jdean,

I don’t know about anyone else, but I need a specific example of what you’re starting with and what you want to end up with. Just one or two examples would be helpful.

  • Carey

I think he means converting — to &mdash;, « to &laquo;, • to &bull; for example

Can you link to the guide that you are following?

Thanks for the replies. Yes, I would be replacing characters such as a dash with &mdash;, but the website warns me not to use numbers such as &#171;, which can cause formatting problems. That’s why he calls them named entities.

Rather than post a link, I’ll copy the relevant portion below–it’s a multi-part series with fairly long chunks of text in each part–a lot to read through. But if you want to see the series, a search for “Guido Henkel” and “ebook” will take you to it.


The next step for us to do is to replace all special characters with their proper HTML entities. There is a very safe way to handle this in HTML that will properly display on every HTML device, regardless of font or text encoding. The key to success lies in HTML’s named entities.

If we take the ellipsis (…), for example, in HTML there is a special code that tells the device to draw that particular character. It is called (&hellip;). With this entity, the device knows to draw an ellipsis that cannot be broken into parts and is treated as a single character.

If you use the entity (&mdash) the device will render a proper em dash. Proper length, proper size and all.

Next up are quotes. For that purpose, HTML offers (&ldquo) and (&rdquo) , entities that represent curly left and right double quotes. Correspondingly, (&lsquo) and (&rsquo) are the entities to draw curly single quotes.

If you happen to see something like this in your HTML code (&#175) you know you’re asking for trouble, so make sure to use named entities only!

The brute force approach would be to search and replace all of them by hand, one entity at a time. This is not only time consuming but also prone to error, as you could all too easily overlook some in your text — but it may be the only option available to you.

The second — and easier way — is to automate the process. TextMate, the programming editor I am using, has a function called “Convert Selection to Entities excluding Tags” and it does exactly what we need. With it, it takes me one mouse-click to have all special characters in my entire book converted to named entities. Remember, using the right tools for the job will always make your life easier!

I had to edit my post, putting parentheses around the named entity commands so they would show as names instead of ellipses, dashes and quotes. Hope that didn’t confuse. And I’ve just realized I should have put a semicolon at the end of each of the commands: (&mdash;)

Your guide is referring to a different editor. Komodo does not come with this functionality built in (at least as far as I am aware), but you can easily implement it through macros.

In fact, it seems like someone has already done this and has shared his code on GitHub:

Alternatively, for a more complete solution you could try the HTML Tools addon, which also seems to offer this functionality (amongst other things).
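
If neither option works out, the conversion can also be done outside the editor. Below is a minimal sketch in Python, not what TextMate or the addon actually does internally. It assumes the text is already HTML, and the function names (to_named_entities, convert_excluding_tags) are just ones I made up for illustration. It uses the standard library's html.entities.codepoint2name table to look up the entity name for each non-ASCII character, and it skips anything inside angle brackets so the tags themselves are left alone. The tag matching is deliberately simple and would trip over a ">" inside an attribute value or an HTML comment.

```python
import re
from html.entities import codepoint2name

# Anything between "<" and ">" is treated as a tag and left untouched.
# Assumption: no ">" appears inside attribute values or comments.
TAG_RE = re.compile(r"(<[^>]*>)")


def to_named_entities(text):
    """Replace non-ASCII characters that have an HTML 4 entity name."""
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        if name is not None and ord(ch) > 127:
            out.append("&" + name + ";")
        else:
            # Plain ASCII (including <, >, & and ") passes through unchanged,
            # as do characters that have no named entity.
            out.append(ch)
    return "".join(out)


def convert_excluding_tags(html):
    """Convert the text between tags, leaving the tags as they are."""
    parts = TAG_RE.split(html)
    # Because the pattern has one capturing group, the odd-numbered
    # parts are the tags and the even-numbered parts are the text.
    return "".join(
        part if i % 2 else to_named_entities(part)
        for i, part in enumerate(parts)
    )


# Escapes are used so the sample does not depend on the file's encoding:
# \u201c and \u201d are curly double quotes, \u2026 is an ellipsis,
# \u2014 is an em dash.
sample = "<p>\u201cWait\u2026\u201d \u2014 she said.</p>"
print(convert_excluding_tags(sample))
# prints: <p>&ldquo;Wait&hellip;&rdquo; &mdash; she said.</p>
```

Note that this converts every character that has a named entity, so accented letters such as é would become &eacute; as well. Try it on a copy of the manuscript first.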

Thanks so much! I’m going to give the HTML Tools addon a try. Having never worked with macros, it looks like it might be a bit less difficult to implement. If it works, you’ve just made my life a whole lot easier. I appreciate you being kind to an amateur.