
The Book Scanning Project



As I have indicated in some other posts, I am doing a project for a local second-hand book store (Good Day Books). The aim is to get some 50,000 books into a database, so that customers who don't live in central Tokyo can order the books online.

The project requires a good deal of website scripting, which I have never done before. (I have mostly done application & system programming on mainframe, Unix and Windows systems.)

So while searching for existing JavaScript functions and developing my own scripts, I thought that I would document my findings here on the forum. They may come in handy some day for other people's projects.

The first part of the project is to build an online application to enter each book. If the book has a barcode with the ISBN or EAN, then it is scanned with a barcode reader. Otherwise the ISBN is entered manually. The script then goes to isbndb.com to retrieve author, title, and other information. Books published before 1970 do not have an ISBN and must be entered completely by hand.

The second part will be to build an online search and display the results, with an option to buy the book(s) [online shopping cart].


Let's start with a simple one: trim the right side of a string from unwanted characters. The barcode scanner always returns a linefeed at the end of the code, and the data returned from isbndb.com often has some blank characters (or even a comma) at the end of the string. I use the rtrim function (found via Google) to remove these characters.

Function code:


function rTrim(s)
{
var r = s.length;
while(r > 0 && (s[r-1] == ' ' || s[r-1] == '\n' || s[r-1] == ','))
{ r -= 1; }
return s.substring(0, r);
}

Usage example:


isbn_val = rTrim(isbn_val);

That function, as originally posted, did not work - it left one character untrimmed if the string consisted only of a blank character.

Update:

Trimming is much easier with a regexp replace, as in these examples:


function ltrim(str)
{
return str.replace(/^\s+/, '');
}
function rtrim(str)
{
return str.replace(/\s+$/, '');
}
function alltrim(str)
{
return str.replace(/^\s+|\s+$/g, '');
}
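One caveat with the regexp versions: \s matches whitespace only, so a trailing comma from isbndb.com survives, whereas the loop-based rTrim removed it. Here is a minimal sketch of a variant that strips both, assuming the comma is the only extra character that needs to go (the name rtrimChars is made up for this example):

```javascript
// Hypothetical helper: strip trailing whitespace AND commas in one pass.
function rtrimChars(str) {
  return str.replace(/[\s,]+$/, '');
}

console.log(rtrimChars('Sagan, Françoise, \n')); // commas inside the string are kept
```

Only the run of characters at the very end of the string is removed, so a comma between two names is left alone.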


Books can carry the 10-digit ISBN, the 13-digit EAN, or both. I will need both in the database for search purposes, so I will need conversion routines between the two. I found both functions via Google, and modified them slightly.

Function code:


function isbn_to_ean(isbn)
{
isbn = isbn.replace(/-/g, ''); // remove any hyphens
var ean = '978' + isbn; // prefix the default for isbn-to-13 numbers
ean = ean.substring(0, 12); // keep the first 12 digits (drops the old ISBN-10 check digit)
var weight = [1,3,1,3,1,3,1,3,1,3,1,3]; // weights to multiply each digit with
var total = 0;
for ( var i=0; i<12; i++ ) // loop over the string and build the weighted total
{
var t = ean.charAt(i);
total += t * weight[i];
}
var check = (total % 10); // the check digit is the modulus...
check = 10 - check; // ... subtracted from 10
if ( check == 10 ) check = 0; // a result of 10 means the check digit is 0
return ean + check.toString();
}


function ean_to_isbn(isbn)
{
if (isbn.substr(0,3) == "978")
{
isbn = isbn.substr(3,9);
var xsum = 0;
var add = 0;
var i = 0;
for (i = 0; i < 9; i++)
{
add = isbn.substr(i,1);
xsum += (10 - i) * add;
}
xsum %= 11;
xsum = 11 - xsum;
if (xsum == 10) { xsum = "X"; }
if (xsum == 11) { xsum = "0"; }
return isbn + xsum;
}
return ''; // not a 978-prefixed EAN, so there is no ISBN-10 equivalent
}

Usage examples:


isbn_val = ean_to_isbn(ean_val);

ean_val = isbn_to_ean(isbn_val);
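Since ISBNs without a barcode are typed in by hand, it may be worth checking the check digit before converting. The following is only a sketch; the function names isValidIsbn10 and isValidEan13 are made up for this example, and the rules used are the standard ones (ISBN-10: weights 10 down to 1, total divisible by 11; EAN-13: weights 1 and 3 alternating, total divisible by 10).

```javascript
// Hypothetical validators; names and structure are illustrative only.

// ISBN-10: weights 10..1, total must be divisible by 11 ('X' counts as 10).
function isValidIsbn10(isbn) {
  var s = isbn.replace(/-/g, '').toUpperCase();
  if (!/^\d{9}[\dX]$/.test(s)) return false;
  var total = 0;
  for (var i = 0; i < 10; i++) {
    var d = (s.charAt(i) === 'X') ? 10 : Number(s.charAt(i));
    total += (10 - i) * d;
  }
  return total % 11 === 0;
}

// EAN-13: weights 1,3,1,3,... over all 13 digits, total must be divisible by 10.
function isValidEan13(ean) {
  var s = ean.replace(/-/g, '');
  if (!/^\d{13}$/.test(s)) return false;
  var total = 0;
  for (var i = 0; i < 13; i++) {
    total += Number(s.charAt(i)) * (i % 2 === 0 ? 1 : 3);
  }
  return total % 10 === 0;
}

console.log(isValidIsbn10('0-306-40615-2')); // true
console.log(isValidEan13('9780306406157'));  // true
```

A typing error in a single digit always changes the weighted total, so both checks catch the most common manual-entry mistake.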


Here is a function that I especially like; it converts accented characters into plain unaccented characters (so that Françoise Sagan is found even if someone types Francoise). Found it via Google, but modified it to include romanized Japanese special characters like ō or ū.

Function code:


function handle_accent(instr)
{
var r = instr;
r = r.replace(new RegExp(/[ÀÁÂÃÄÅ]/g),'A');
r = r.replace(new RegExp(/[Æ]/g),'AE');
r = r.replace(new RegExp(/[Ç]/g),'C');
r = r.replace(new RegExp(/[ÈÉÊË]/g),'E');
r = r.replace(new RegExp(/[ÌÍÎÏ]/g),'I');
r = r.replace(new RegExp(/[Ñ]/g),'N');
r = r.replace(new RegExp(/[ÒÓÔÕÖØ]/g),'O');
r = r.replace(new RegExp(/[ÙÚÛÜ]/g),'U');
r = r.replace(new RegExp(/[Ý]/g),'Y');
r = r.replace(new RegExp(/[ß]/g),'ss');
r = r.replace(new RegExp(/[àáâãäå]/g),'a');
r = r.replace(new RegExp(/[æ]/g),'ae');
r = r.replace(new RegExp(/[ç]/g),'c');
r = r.replace(new RegExp(/[èéêë]/g),'e');
r = r.replace(new RegExp(/[ìíîï]/g),'i');
r = r.replace(new RegExp(/[ñ]/g),'n');
r = r.replace(new RegExp(/[òóôõöø]/g),'o');
r = r.replace(new RegExp(/[ùúûü]/g),'u');
r = r.replace(new RegExp(/[ýÿ]/g),'y');
r = r.replace(new RegExp(/[\u0100\u0102]/g),'A'); // A w/macron, breve
r = r.replace(new RegExp(/[\u0101\u0103]/g),'a'); // a w/macron, breve
r = r.replace(new RegExp(/[\u0106\u0108\u010A\u010C]/g),'C'); // C w/acute, circumflex, dot, caron
r = r.replace(new RegExp(/[\u0107\u0109\u010B\u010D]/g),'c'); // c w/acute, circumflex, dot, caron
r = r.replace(new RegExp(/[\u0112\u0116\u011A]/g),'E'); // E w/macron, dot, caron
r = r.replace(new RegExp(/[\u0113\u0117\u011B]/g),'e'); // e w/macron, dot, caron
r = r.replace(new RegExp(/[\u011C\u011E\u0120]/g),'G'); // G w/circumflex, breve, dot
r = r.replace(new RegExp(/[\u011D\u011F\u0121]/g),'g'); // g w/circumflex, breve, dot
r = r.replace(new RegExp(/[\u0128\u012A\u012C\u0130]/g),'I'); // I w/tilde, macron, breve, dot
r = r.replace(new RegExp(/[\u0129\u012B\u012D\u0131]/g),'i'); // i w/tilde, macron, breve, dotless
r = r.replace(new RegExp(/[\u0143\u0145\u0147]/g),'N'); // N w/acute, cedilla, caron
r = r.replace(new RegExp(/[\u0144\u0146\u0148]/g),'n'); // n w/acute, cedilla, caron
r = r.replace(new RegExp(/[\u014C\u014E\u0150]/g),'O'); // O w/macron, breve, double acute
r = r.replace(new RegExp(/[\u014D\u014F\u0151]/g),'o'); // o w/macron, breve, double acute
r = r.replace(new RegExp(/[\u0152]/g),'OE'); // ligature OE
r = r.replace(new RegExp(/[\u0153]/g),'oe'); // ligature oe
r = r.replace(new RegExp(/[\u0154\u0156\u0158]/g),'R'); // R w/acute, cedilla, caron
r = r.replace(new RegExp(/[\u0155\u0157\u0159]/g),'r'); // r w/acute, cedilla, caron
r = r.replace(new RegExp(/[\u015A\u015C\u015E\u0160]/g),'S'); // S w/acute, circumflex, cedilla, caron
r = r.replace(new RegExp(/[\u015B\u015D\u015F\u0161]/g),'s'); // s w/acute, circumflex, cedilla, caron
r = r.replace(new RegExp(/[\u0168\u016A\u016C\u0170]/g),'U'); // U w/tilde, macron, breve, double acute
r = r.replace(new RegExp(/[\u0169\u016B\u016D\u0171]/g),'u'); // u w/tilde, macron, breve, double acute
r = r.replace(new RegExp(/[\u0174]/g),'W'); // W w/circumflex
r = r.replace(new RegExp(/[\u0175]/g),'w'); // w w/circumflex
r = r.replace(new RegExp(/[\u0176\u0178]/g),'Y'); // Y w/circumflex, diaeresis
r = r.replace(new RegExp(/[\u0177]/g),'y'); // y w/circumflex
r = r.replace(new RegExp(/[\u0179\u017B\u017D]/g),'Z'); // Z w/acute, dot, caron
r = r.replace(new RegExp(/[\u017A\u017C\u017E]/g),'z'); // z w/acute, dot, caron
if (r == instr) // nothing was replaced, so there is no alternative spelling to store
return "";
else
return r;
}

Usage example:


alt_author_val = handle_accent(author_val);

Update 1:

Added many more characters after finding names like Joseph Grčić, Søren Kierkegaard, Slavoj Žižek, and - get this: Çiğdem Kağıtçıbaşı !

Update 2:

Added even more characters; my script is now probably the most comprehensive on the Internet! (A total of 112 characters replaced.)
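As an alternative to listing every character by hand, here is a sketch that relies on Unicode normalization: String.prototype.normalize('NFD') splits most accented letters into a base letter plus combining marks, which can then be removed. A few letters (ø, ß, æ, dotless ı and others) do not decompose and still need an explicit table. The name stripAccents and the SPECIAL table are made up for this example, the table is not exhaustive, and unlike handle_accent() this version returns the input unchanged when there is nothing to replace.

```javascript
// Characters that have no canonical decomposition and need an explicit mapping.
// This list is illustrative, not exhaustive.
var SPECIAL = {
  'Æ': 'AE', 'æ': 'ae', 'Œ': 'OE', 'œ': 'oe', 'Ø': 'O', 'ø': 'o',
  'ß': 'ss', 'ı': 'i', 'Đ': 'D', 'đ': 'd', 'Ł': 'L', 'ł': 'l'
};

function stripAccents(str) {
  return str
    .normalize('NFD')                 // split letters from their combining marks
    .replace(/[\u0300-\u036f]/g, '')  // drop the combining marks
    .replace(/[ÆæŒœØøßıĐđŁł]/g, function (c) { return SPECIAL[c]; });
}

console.log(stripAccents('Françoise Sagan'));    // Francoise Sagan
console.log(stripAccents('Çiğdem Kağıtçıbaşı')); // Cigdem Kagitcibasi
```

normalize() is not available in very old browsers, so the hand-written table remains the safer choice if those must be supported.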


I have updated the handle_accent() function above, as I have discovered several other characters that were used in authors' names (like Ĉ, ć, č).

I am still working on a function that extracts the octet stream of images, so that it can be stored in the database. I will publish it here once I get it to work.


Here is a little function that checks or unchecks all checkboxes (with a certain name) on a page. This is useful for my project when I want to delete a bunch of items from the database.

Function code:


function CheckAll(action)
{
var boxes = document.getElementsByName('del_box'); // look the list up once
for (var i = 0; i < boxes.length; i++)
boxes[i].checked = (action == 'Y');
}

Usage example:


<input type="button" name="Check_All" value="Check All" onClick="javascript:CheckAll('Y')"> listed
<input type="button" name="Un_CheckAll" value="Uncheck All" onClick="javascript:CheckAll('N')"> listed
<!-- other html -->
<div id="results">
<!-- other html -->
<input type="checkbox" name="del_box" value="Delete" />
<!-- other html -->
<input type="checkbox" name="del_box" value="Delete" />
<!-- other html -->
<input type="checkbox" name="del_box" value="Delete" />
<!-- other html -->
<input type="checkbox" name="del_box" value="Delete" />
<!-- other html -->
</div>
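The same logic can be separated from the page so that it can be tried outside a browser. This is only a sketch: setAllChecked and countChecked are made-up names, and they accept any array-like list of objects with a checked property, which is what document.getElementsByName('del_box') returns in the browser.

```javascript
// Hypothetical helpers that work on any array-like list of checkbox-like objects.
function setAllChecked(boxes, checked) {
  for (var i = 0; i < boxes.length; i++) {
    boxes[i].checked = checked;
  }
}

function countChecked(boxes) {
  var count = 0;
  for (var i = 0; i < boxes.length; i++) {
    if (boxes[i].checked) count += 1;
  }
  return count;
}

// Stand-in for the real checkboxes, so the sketch runs without a page.
var fakeBoxes = [{ checked: false }, { checked: true }, { checked: false }];
setAllChecked(fakeBoxes, true);
console.log(countChecked(fakeBoxes)); // 3

// In the browser: setAllChecked(document.getElementsByName('del_box'), true);
```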


As a continuation of the above post, I needed a script that goes through all the checkboxes, counts the ones that are checked, and issues a confirmation message in a dynamically created form on the page. If the operator clicks the Delete button, then the items are purged from the database. If the operator clicks the Reset button, then I want to remove the dynamic form from view.

It was most interesting to create these scripts...

Function code:


function goFindAll(action)
{
const n = document.getElementsByName('del_box').length;
var count = 0;
var newitem = '';
for (var i = 0; i < n; i++)
if (document.getElementsByName('del_box')[i].checked)
count += 1;

var mybutton = document.getElementById('addButton'); // the new node is inserted before this hidden field
if (action == 'D')
newitem = '<br /><span class=\"error\">Are you sure you want to delete these ' + count + ' Book(s)?</span><br /><br />';
else
newitem = '<br /><span class=\"error\">Are you sure you want to hide these ' + count + ' Book(s)?</span><br /><br />';
if (action == 'D')
newitem += '<input type=\"submit\" name=\"doDeleteAll\" value=\"Delete All\" class=\"small_submit\" /> ';
else
newitem += '<input type=\"submit\" name=\"doHideAll\" value=\"Hide All\" class=\"small_submit\" /> ';
newitem += '<input type=\"button\" name=\"back\" value=\"Reset\" class=\"small_submit\" onClick=\"javascript:ResetAll()\" /><br /><br />';
newitem += '<hr color=\"#cccccc\" />';
var newnode = document.createElement("span");
newnode.innerHTML = newitem;
mybutton.parentNode.insertBefore(newnode, mybutton);
}

function ResetAll()
{
var div = document.getElementById('dyndiv');
var child = div.firstElementChild;
div.removeChild(child);
}

Usage example:


<input type="submit" name="deleteAll" value="Delete" class="submit" onClick="javascript:goFindAll('D')" /> all checked<br />
<input type="submit" name="hideAll" value="Hide" class="submit" onClick="javascript:goFindAll('H')" /> all checked<br />
<!-- other html -->
<form name="dynform">
<div id="dyndiv">
<input type="hidden" id="addButton" />
</div>
</form>


I needed a way to keep one piece of information from one entry panel to the next, specifically the category where a book is physically kept. This is not normally possible, as all data (HTML and JavaScript data, even global variables) are lost over a page refresh. One simple way of doing this on the browser side is a short-term cookie. Here is the script that resolved it.

Function code:


function getcookievalue()
{
var start = document.cookie.indexOf('category=');
if (start == -1) // 'category=' not found
return '';
var end = document.cookie.indexOf(';', start); // first ; after start
if (end == -1) // ';' not found
end = document.cookie.length;
return document.cookie.substring(start+10, end-1); // skip 'category="' and the closing quote
}

Usage example:


if (category_val != '')
document.cookie = 'category=\"' + category_val + '\"';
else
{
category_val = getcookievalue();
if (category_val != '')
document.getElementById('new_category').value = category_val;
}

Update:

The original code did not work when the cookie value consisted of more than one word; the cookie value needs to be placed between double quotes. The scripts above have been corrected.
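Another way to cope with spaces and other special characters is to encode the value with encodeURIComponent() instead of quoting it. The sketch below is an assumption about how that could look, not the code used on the site; buildCookie and readCookie are made-up names, and they work on plain strings so they can be tried without a browser. Matching on the whole cookie name also avoids picking up a cookie such as subcategory= by accident.

```javascript
// Hypothetical helpers; they work on plain strings so they can be tested outside a browser.

// Build the string to assign to document.cookie.
function buildCookie(name, value) {
  return name + '=' + encodeURIComponent(value);
}

// Extract one cookie's value from a document.cookie-style string.
function readCookie(cookieString, name) {
  var parts = cookieString.split(';');
  for (var i = 0; i < parts.length; i++) {
    var part = parts[i].replace(/^\s+/, ''); // cookies are separated by '; '
    if (part.indexOf(name + '=') === 0) {
      return decodeURIComponent(part.substring(name.length + 1));
    }
  }
  return ''; // cookie not set
}

console.log(buildCookie('category', 'Science Fiction')); // category=Science%20Fiction
```

In the browser this would be used as document.cookie = buildCookie('category', category_val); and category_val = readCookie(document.cookie, 'category');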


And now for something completely different: let's have a bit of SQL...

All data are stored in a PostgreSQL database table. So far, when searching for books using author name, book title, subject, or ISBN, the underlying query has returned the data ordered by author's name. This is no longer practical, as the authors' names we get from isbndb.com are mostly in First Name - Last Name order, and often prefixed with [by] or [ed] or [tr].

What I wanted was for the data to be returned in order of relevance to the search keywords. My first idea was to employ a UNION of multiple queries, but this proved cumbersome, and it also returned too many rows, some of them duplicates.

So I did some research and came across a concept called Tsearch2, which has been built into PostgreSQL as its full-text search. It took me a while to understand how the thing works, but in the end I came up with a query form that returns exactly the results I want, in the correct order of relevance.

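Here is the query, where keyword1, keyword2, ... are the search keywords. The columns are joined with a space in between, so that the last word of one column does not fuse with the first word of the next. (If any of the columns can be NULL, it should be wrapped in coalesce(), because a single NULL makes the whole concatenation NULL.)


SELECT book_id, btitle, author, pub, yearp, isbn, com, cond, price, photo, hide_book, category, ean,
ts_rank(to_tsvector("btitle"||' '||"alt_btitle"||' '||"author"||' '||"alt_author"||' '||"com"||' '||"pub"||' '||"isbn"||' '||"ean"), to_tsquery('keyword1 | keyword2 | ...')) AS rank
FROM gdb_books
WHERE to_tsquery('keyword1 | keyword2 | ...') @@ to_tsvector("btitle"||' '||"alt_btitle"||' '||"author"||' '||"alt_author"||' '||"com"||' '||"pub"||' '||"isbn"||' '||"ean")
ORDER BY rank DESC;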
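The keyword string itself has to be built from whatever the customer types. to_tsquery() treats characters such as &, |, !, ( and ) as operators, so raw input can cause a syntax error. Here is a sketch of one way to prepare the string; buildTsQuery is a made-up name, and the rule of keeping only letters and digits is an assumption that may be too strict for some searches.

```javascript
// Hypothetical helper: turn free-form user input into an OR-style tsquery string.
function buildTsQuery(input) {
  var words = input
    .split(/[^\p{L}\p{N}]+/u)   // split on anything that is not a letter or digit
    .filter(function (w) { return w.length > 0; });
  return words.join(' | ');
}

console.log(buildTsQuery('tokyo  (guide) & map')); // tokyo | guide | map
```

An empty result means there is nothing to search for, so the query should be skipped in that case. The resulting string should still be passed to the database as a bound parameter rather than pasted into the SQL text.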


  • 2 weeks later...

How to crop an image in HTML, or dynamically with JavaScript? I have been searching the whole afternoon, and there is a lot of misinformation all over the Internet.

Example; I have an image like this

<img style="border:1px solid black;" src="http://images.amazon.com/images/P/0679430881.01._AA175_PU_PU-5_.jpg" />

but I need to crop some of the white space around it when displaying it. Simple, use the margin style attribute, like

<div id="pat" style="overflow:hidden;">
<img style="margin-left:-31px; margin-right:-32px;" src="..." />
</div>

This results in

<div style="overflow:hidden; width:150px;">

<img style="border:1px solid black; margin-left:-31px; margin-right:-32px;" src="http://images.amazon.com/images/P/0679430881.01._AA175_PU_PU-5_.jpg" />

</div>

This can also easily be done dynamically using JavaScript, e.g.

var imgurl = '...';
document.getElementById('pat').getElementsByTagName('img')[0].style.margin = '-3px -16px 0px -15px';
document.getElementById('pat').getElementsByTagName('img')[0].src = imgurl;

The margins in the above code stand for top, right, bottom, left.
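The margin values can also be calculated instead of hard-coded. The sketch below assumes that the image's width is known (for example from img.naturalWidth once the image has loaded) and that the crop should be centred horizontally; cropMargins is a made-up name.

```javascript
// Hypothetical helper: CSS margin value that crops an image horizontally,
// centred, from imageWidth down to visibleWidth (both in pixels).
function cropMargins(imageWidth, visibleWidth) {
  var excess = Math.max(0, imageWidth - visibleWidth);
  var left = Math.floor(excess / 2);
  var right = excess - left;
  // Order is top, right, bottom, left.
  return '0px -' + right + 'px 0px -' + left + 'px';
}

console.log(cropMargins(175, 112)); // 0px -32px 0px -31px
```

The result can be assigned to the image's style.margin, while the surrounding div keeps overflow:hidden so that the part pushed outside is not shown.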


A little bit more HTML...

Here is my input panel

[Image: screenshot of the input panel]

And I wanted to have a border around the Category box in the lower right corner. I struggled for several hours (!) to finally get it to look how I wanted it. (First the yellow border was half hidden behind the text box, and then I couldn't get any distance from the boxes above.)

The first thing I had to learn about borders were the terms padding and margin. Padding adds space between the border and the item(s) inside the border. Margin adds space outside of the border.

The second thing I found out, after the thing never seemed to work 100%, was that I needed to place these style elements into a <div> tag and not the <span> tag that I had used originally. (A <span> is an inline element, so top and bottom margins have no effect on it.)

Here is the complete HTML code for that bordered little box


<div style="border-style:solid; border-width:2px; border-color:gold; margin-top:3px; padding-top:1px; padding-bottom:1px;">
<label for="new_category"><b>Cat </b><input type="text" id="new_category" name="new_category" class="input" size="15" value="{cookie}" /></label>
</div>


Nice work Pat.

I know nothing at all about this type of thing (coding, HTML, etc.).

Can I ask why, in the case of the image with an unwanted border, it is not possible to just crop the image in a photo editor beforehand instead of having to resort to writing a script?

To the uninitiated it seems a long route just to crop a photo. Very obviously it must be the correct route, otherwise you would not have spent a whole afternoon researching it.

Just curious.

John.


Good question - I should have explained...

The image does not belong to me, and it is not hosted on my server. I also do not know beforehand what image will be used; it will be displayed beside the book search results.

I just know that the image will be too wide, and therefore I want to crop it dynamically.


We all have an inkling that Internet Explorer is different from other browsers, but especially web developers know that there are big differences between these browsers, and that great care must be taken that websites display the same way on every browser.

I had developed the book management site completely on and for Firefox, so I was astonished when I found that one of our users is using IE, and that my input form looked completely different.

A textarea with rows="1" displays a height of 1 row on IE, but 2 rows on FF. Stupid, but that's how it is. So I spent another afternoon searching for a way to make the form the same on both browsers.

It's not easy to do browser checks in hard-coded HTML, so I found a solution using DTML (the website is based on Zope, which uses a special language called DTML).

Here is the code that changes the rows value depending on the browser:


<textarea id="new_isbn" name="new_isbn" class="input" rows="<dtml-if "HTTP_USER_AGENT.find('MSIE')>=0">2<dtml-else>1</dtml-if>" cols="12"><dtml-var isbn missing=""></textarea>

I also noticed that the little JavaScript ScrollTo I did on Firefox (to make the input form completely visible) does not work on IE. The solution is all over the Internet, but I used the same technique as above to make it work:


<dtml-if "HTTP_USER_AGENT.find('MSIE')>=0">
setTimeout('window.scrollTo(0,150)',1);
<dtml-else>
window.scrollTo(0,150);
</dtml-if>


  • 1 month later...

Now that we have some 10,000 books in the database, I find that a lot of them (30% - 40%) do not have a barcode. While entering the book, this is not a big deal - the ISBN can be typed manually. Books that were printed before 1970 do not even have an ISBN, and all data are entered manually.

However, when these books are sold at the counter, they must be removed from the database. Books with a barcode are done within a second (with the scanner), but books without one must be noted down manually. This is very awkward when somebody buys 10 or 15 books...

So I have decided that we must print our own labels with barcodes on them:

  • the ISBN if available
  • or our own internal number (database key) if there is no ISBN

Examples:

[Images: two example barcode labels]

Extracting the text data on top from the database is easy enough, but I have found that printing a barcode is not as straightforward as printing text:

  1. you need a barcode font
  2. you need to encode the number to make the barcode scannable

It took me a while to find a free barcode font (it has to be an EAN-13 barcode font), but I found one eventually at http://sourceforge.net/projects/openbarcodes/files/

Encoding the ISBN or our internal numbers took more research; the basic information is at http://en.wikipedia.org/wiki/EAN-13, with additional info at http://www.barcodeisland.com/ean13.phtml and http://www.barcodeisland.com/ean8.phtml

So in the end all I needed to do was to write some small JavaScript functions to encode those numbers. (The ISBN already contains the required check digit, but for our internal number the check digit has to be calculated.)


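function EAN13(ean)
{
// Font characters for the three EAN digit encodings (L, G and R)
var L=['A','B','C','D','E','F','G','H','I','J'];
var G=['K','L','M','N','O','P','Q','R','S','T'];
var R=['a','b','c','d','e','f','g','h','i','j'];
var n=0;
var s='9'; // leading digit; the L/G order used below (LGGLGL) is the one for a leading 9, i.e. ISBNs (978/979)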

for (var i = 1; i < 13; i++)
switch (i)
{
case 1:
case 4:
case 6:
n = ean.charAt(i) - 0;
s += L[n];
break;
case 2:
case 3:
case 5:
n = ean.charAt(i) - 0;
s += G[n];
break;
case 7:
s += '*'; // centre guard, then fall through to encode this digit as a right-hand digit
default:
n = ean.charAt(i) - 0;
s += R[n];
break;
}
s += '+';
return s;
}

function EAN8(book_id)
{
var L=['A','B','C','D','E','F','G','H','I','J'];
var R=['a','b','c','d','e','f','g','h','i','j'];
var n=0;
var s=':';
var ean8 = lpad(book_id+'',7);

ean8 += EAN8cd(ean8);

for (var i = 0; i < 8; i++)
switch (i)
{
case 0:
case 1:
case 2:
case 3:
n = ean8.charAt(i) - 0;
s += L[n];
break;
case 4:
s += '*'; // centre guard, then fall through to encode this digit as a right-hand digit
default:
n = ean8.charAt(i) - 0;
s += R[n];
break;
}
s += '+';
return s;
}

function EAN8cd(ean)
{
var checkDigit = 10 - ((
3 * ean.charAt(0) +
1 * ean.charAt(1) +
3 * ean.charAt(2) +
1 * ean.charAt(3) +
3 * ean.charAt(4) +
1 * ean.charAt(5) +
3 * ean.charAt(6)) % 10);

if (checkDigit == 10)
return '0';
else
return checkDigit+'';
}

function lpad(number, length)
{
var str = '' + number;
while (str.length < length)
str = '0' + str;

return str;
}
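The EAN13() function above hard-codes the left-hand pattern for a leading 9, which is all that is needed for ISBNs. For other EAN-13 numbers the pattern depends on the first digit. Here is a sketch of a generalised version that uses the standard EAN-13 parity table; it assumes the same font character mapping as in the post (A-J, K-T, a-j, '*' and '+') and that the font takes the leading digit as-is, which I have only seen confirmed for 9. The name ean13ToFont is made up.

```javascript
// Parity pattern of digits 2-7, selected by the first digit (standard EAN-13 table).
var PARITY = ['LLLLLL', 'LLGLGG', 'LLGGLG', 'LLGGGL', 'LGLLGG',
              'LGGLLG', 'LGGGLL', 'LGLGLG', 'LGLGGL', 'LGGLGL'];

// Font characters for digits 0-9 in each encoding, as used in the post.
var FONT = {
  L: 'ABCDEFGHIJ',
  G: 'KLMNOPQRST',
  R: 'abcdefghij'
};

function ean13ToFont(ean) {
  var first = Number(ean.charAt(0));
  var s = ean.charAt(0); // assumption: the font takes the leading digit as-is
  for (var i = 1; i <= 6; i++) {
    // pick L or G for this position, then the character for this digit
    s += FONT[PARITY[first].charAt(i - 1)].charAt(Number(ean.charAt(i)));
  }
  s += '*'; // centre guard
  for (var j = 7; j <= 12; j++) {
    s += FONT.R.charAt(Number(ean.charAt(j)));
  }
  return s + '+'; // end guard
}

console.log(ean13ToFont('9780306406157')); // 9HSKDKG*eagbfh+
```

For any number starting with 9 this produces the same string as EAN13().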


  • 9 months later...
  • 1 month later...

There is constantly work to be done on the project. All books are in the database now, but almost on a daily basis we find that some books are in the database, but we cannot actually find the book where it is supposed to be! So it's time to conduct an inventory; this will also help us to correct titles and author names, as well as identify books that still don't have a barcode label.

Writing database entries out to an Excel document is easy enough, but I encountered a strange problem: all accented characters were somehow transformed into some strange Chinese characters, e.g. John Le Carré ended up as John Le Carr鼯

I tried and tried, but I was unable to prevent that. I also searched the Internet, but it seems nobody has encountered this problem before.

I finally found that using the HTML entity equivalent (e.g. &auml; for 'ä') will solve my problem. So I wrote a little Python function (Python being much easier than JavaScript) that transforms all accented characters to their HTML entity equivalent. Here is the function (instr is the input argument):


s = instr
s = s.replace('À','&Agrave;')
s = s.replace('Á','&Aacute;')
s = s.replace('Â','&Acirc;')
s = s.replace('Ã','&Atilde;')
s = s.replace('Ä','&Auml;')
s = s.replace('Å','&Aring;')
s = s.replace('Æ','&AElig;')
s = s.replace('Ç','&Ccedil;')
s = s.replace('È','&Egrave;')
s = s.replace('É','&Eacute;')
s = s.replace('Ê','&Ecirc;')
s = s.replace('Ë','&Euml;')
s = s.replace('Ì','&Igrave;')
s = s.replace('Í','&Iacute;')
s = s.replace('Î','&Icirc;')
s = s.replace('Ï','&Iuml;')
s = s.replace('Ñ','&Ntilde;')
s = s.replace('Ò','&Ograve;')
s = s.replace('Ó','&Oacute;')
s = s.replace('Ô','&Ocirc;')
s = s.replace('Õ','&Otilde;')
s = s.replace('Ö','&Ouml;')
s = s.replace('Ø','&Oslash;')
s = s.replace('Ù','&Ugrave;')
s = s.replace('Ú','&Uacute;')
s = s.replace('Û','&Ucirc;')
s = s.replace('Ü','&Uuml;')
s = s.replace('Ý','&Yacute;')
s = s.replace('ß','&szlig;')
s = s.replace('à','&agrave;')
s = s.replace('á','&aacute;')
s = s.replace('â','&acirc;')
s = s.replace('ã','&atilde;')
s = s.replace('ä','&auml;')
s = s.replace('å','&aring;')
s = s.replace('æ','&aelig;')
s = s.replace('ç','&ccedil;')
s = s.replace('è','&egrave;')
s = s.replace('é','&eacute;')
s = s.replace('ê','&ecirc;')
s = s.replace('ë','&euml;')
s = s.replace('ì','&igrave;')
s = s.replace('í','&iacute;')
s = s.replace('î','&icirc;')
s = s.replace('ï','&iuml;')
s = s.replace('ñ','&ntilde;')
s = s.replace('ò','&ograve;')
s = s.replace('ó','&oacute;')
s = s.replace('ô','&ocirc;')
s = s.replace('õ','&otilde;')
s = s.replace('ö','&ouml;')
s = s.replace('ø','&oslash;')
s = s.replace('ù','&ugrave;')
s = s.replace('ú','&uacute;')
s = s.replace('û','&ucirc;')
s = s.replace('ü','&uuml;')
s = s.replace('ý','&yacute;')
s = s.replace('ÿ','&yuml;')
return s
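The same idea can be sketched in JavaScript without a per-character table, by replacing every non-ASCII character with a numeric HTML entity. This is an assumption on my part rather than something tested against the Excel export: it presumes the file is read as HTML, so that numeric entities are decoded the same way as the named ones. The name toNumericEntities is made up.

```javascript
// Hypothetical helper: replace every non-ASCII character with a numeric HTML entity.
function toNumericEntities(str) {
  // The u flag makes the regexp match whole code points, not UTF-16 halves.
  return str.replace(/[^\x00-\x7F]/gu, function (ch) {
    return '&#' + ch.codePointAt(0) + ';';
  });
}

console.log(toNumericEntities('John Le Carré')); // John Le Carr&#233;
```

This also covers characters outside the Latin-1 range, such as ō or Ž, which have no entry in the list of named entities used above.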

