• Getting IDLE to remove non-ASCII characters
    15 replies, posted
IDLE keeps telling me that there are non-ASCII characters in my script, and even though I've Ctrl+H'ed my way through the entire document, replacing everything that isn't in the ASCII table, it still complains. Is there an easy way to remove all non-ASCII characters? Perhaps by copying the script to another editor with better support for this? Or does IDLE have a built-in function? Oh yeah, I'm on Python 2.7.
That would be easy to do in C. [code]while ((c = fgetc(ifptr)) != EOF) { /* keep printable ASCII and whitespace */ if ((c >= 32 && c <= 126) || c == '\n' || c == '\r' || c == '\t') fputc(c, ofptr); }[/code] Oh, unless the encoding is what's non-ASCII. Do you save your Python scripts in UTF or something?
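Since the script is Python anyway, the same kind of byte filter can be sketched in Python. This is only a rough sketch: the function name `strip_non_ascii` and the file names in the usage note are made up for illustration, and it assumes you want to keep printable ASCII plus tab, newline and carriage return.

```python
def strip_non_ascii(data):
    """Return the bytes in `data` with everything dropped except
    printable ASCII (32..126), tab, newline and carriage return."""
    keep = set(range(32, 127)) | {9, 10, 13}
    # bytearray yields integers on both Python 2.7 and Python 3
    return bytes(bytearray(b for b in bytearray(data) if b in keep))


if __name__ == "__main__":
    sample = b"caf\xc3\xa9 = 1\r\n"
    print(repr(strip_non_ascii(sample)))
```

To clean a file, read it in binary mode (`open("script.py", "rb")`), pass the contents through the function, and write the result to a new file opened with `"wb"`. Keep the original until you have checked the output.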
If you're running a UNIX-like OS, you could use the command "tr".
[QUOTE=Larikang;25410923]Do you save your Python scripts in UTF or something?[/QUOTE] Oh no, I'm doing the standard ASCII save. If I didn't, newlines and stuff like that wouldn't work. @ROBO_DONUT: Nope, Windows here. I think I'll just have to go through it again...
So by non-ASCII, I'm guessing it means non-printable ASCII?
Why not post your code here, so that we [pythonistas] may have a look-see? :)
[QUOTE=LemONPLaNE;25411246]@ROBO_DONUT: Nope, Windows here. I think I'll just have to go through it again...[/QUOTE] What if you tried iconv or something? There seems to be [url=http://gnuwin32.sourceforge.net/packages/libiconv.htm]a Windows version[/url]. [editline]15th October 2010[/editline] Oh wait, that's just the library, sorry.
Is the document UTF-8 encoded or something? You could try saving it as ANSI in any decent text editor. Microsoft's Notepad should do.
ANSI isn't standard ASCII only. See, when IDLE sees non-ASCII in the document, it asks me whether I want to save in ANSI. Which I don't, because if you save in ANSI it treats ASCII unprintables (\n, \r, \t and the like) as standard text, which makes my script print funny. [QUOTE=Chandler;25413492]Why not post your code here, so that we [pythonistas] may have a look-see? :)[/QUOTE] You'd claw your eyes out.
Eh, of course. Didn't think about the fact that extended ASCII is not ASCII :P You could write a small program that goes through each character in the file and reports any character that is (< ' ' or > '~') and != '\t' and != '\n'. Just compile that and run it from the command line.
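Here is roughly what such a checker could look like in Python, so nothing needs compiling. It is a sketch under a couple of assumptions: the name `find_suspect_bytes` is invented for this example, columns are counted in bytes rather than characters, and carriage return is allowed as well since the file comes from Windows.

```python
def find_suspect_bytes(data):
    """Return a list of (line, column, byte_value) for every byte that is
    outside printable ASCII and is not tab, newline or carriage return."""
    allowed_controls = {9, 10, 13}  # \t, \n, \r
    hits = []
    line, col = 1, 0
    for b in bytearray(data):
        col += 1
        if b == 10:  # newline: move on to the next line
            line += 1
            col = 0
        elif not (32 <= b <= 126) and b not in allowed_controls:
            hits.append((line, col, b))
    return hits


if __name__ == "__main__":
    sample = b"x = 1\n# \xc3\xa4\n"
    for line, col, value in find_suspect_bytes(sample):
        print("line %d, byte %d: 0x%02x" % (line, col, value))
```

Run it on the contents of the script read in binary mode and it tells you where to look, instead of deleting anything for you.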
Or I could rewrite my comments in English. Which I did. Problem solved.
[QUOTE=LemONPLaNE;25417995]Or I could rewrite my comments in English. Which I did. Problem solved.[/QUOTE] Actually, the best thing you could do is save your files as UTF-8. It's fully compatible with pure ASCII, in the sense that every ASCII character is encoded as the same single byte, and every byte of a multi-byte UTF-8 sequence has its high bit set, so none of them can be mistaken for an ASCII character. Or so I've heard. [editline]15th October 2010[/editline] But commenting in English is still a good idea. I do it and I'm Finnish. :smile:
[QUOTE=LemONPLaNE;25411246]Oh no, I'm doing the standard ASCII save. If I didn't, newlines and stuff like that wouldn't work.[/QUOTE] Sure they would. UTF-8 is backwards compatible with the ASCII character set, which includes all the control characters of ANSI/ASCII.
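If you want to see the compatibility for yourself, a small check along these lines should do. It is only an illustrative sketch: the helper name `utf8_bytes` and the sample strings are made up for this example.

```python
def utf8_bytes(text):
    """Return the UTF-8 encoding of `text` as a list of integer byte values."""
    return list(bytearray(text.encode("utf-8")))


if __name__ == "__main__":
    # Pure ASCII text, control characters included, encodes to the same bytes.
    ascii_text = u"print 'hello'\n\t# done\r\n"
    print(ascii_text.encode("utf-8") == ascii_text.encode("ascii"))

    # Every byte of a multi-byte sequence is 0x80 or above,
    # so it can never collide with an ASCII byte such as \n or \t.
    print([hex(b) for b in utf8_bytes(u"\u00e4\u20ac")])
```

So newlines, tabs and carriage returns are stored as exactly the same bytes whether the file is saved as ASCII or as UTF-8.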
You can actually make Python treat the file as UTF-8 by placing the following at the top of your file (below any crunchbang (#!) statements of course) [code] # -*- coding: utf-8 -*- [/code]
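For what it's worth, PEP 263 says the declaration only counts on the first or second line of the file. Below is a simplified sketch of how you could check a script for it. The function name `declared_encoding` is invented for this example, the pattern is the one given in PEP 263, and the real interpreter applies a few extra rules that this sketch ignores.

```python
import re

# Pattern for an encoding declaration, as given in PEP 263.
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def declared_encoding(source_text):
    """Return the encoding named in the first two lines, or None if absent."""
    for line in source_text.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    script = "#!/usr/bin/env python\n# -*- coding: utf-8 -*-\nprint('hi')\n"
    print(declared_encoding(script))
```

A declaration that only appears on the third line or later is not picked up, which is why it has to sit right under the #! line.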
[QUOTE=Chandler;25443654]You can actually make Python treat the file as UTF-8 by placing the following at the top of your file (below any crunchbang (#!) statements of course) [code] # -*- coding: utf-8 -*- [/code][/QUOTE] kludge alert :siren:
[QUOTE=X'Trapolis;25449471]kludge alert :siren:[/QUOTE] What, that's not an inelegant or impractical solution that works surprisingly well. It's just a handy feature, I've used it a bunch.