Questions tagged [utf-8]

1

votes
1

answer
176

Views

write unicode data to mssql with python?

I'm trying to write a table from a .csv file with Hebrew text in it to an SQL Server database. The table is valid and pandas reads the data correctly (it even displays the Hebrew properly in PyCharm), but when I try to write it to a table in the database I get question marks ('???') where the Hebrew shou...
Dror Bogin
1

votes
2

answer
34

Views

How do I print a sequence of characters (hex) as UTF-8 characters in Perl?

I have a variable in Perl that is just a sequence of letters and digits: my $var = '48656c6c6f20576f726c64'; The sequence represents the string Hello World. I.e. one can think of it as 0x48 0x65 0x6c 0x6c 0x6f 0x20 0x57 0x6f 0x72 0x6c 0x64, the hex representation of Hello World. How do I print $var...
user2426316
1

votes
2

answer
31

Views

Convert characters using Python

I receive a text file, but some characters on it are not correct. One example is the text below: Apresentação/ divulgação do curso But the correct text is Apresentação/ divulgação do curso I use the Php function utf8_decode and it works, see example below echo utf8_decode('Apresent...
fabiobh
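
For the question above, a minimal Python sketch of the same repair that PHP's `utf8_decode` performs. It assumes the text was UTF-8 that got decoded as ISO-8859-1 (Latin-1) exactly once; if Windows-1252 was involved instead, `"cp1252"` may be needed in place of `"latin-1"`. The helper name `fix_latin1_mojibake` and the sample string are illustrative, not taken from the question.

```python
def fix_latin1_mojibake(text: str) -> str:
    """Undo one round of 'UTF-8 bytes read as Latin-1' mis-decoding."""
    # Latin-1 maps code points 0-255 one-to-one onto bytes, so encoding
    # recovers the original UTF-8 bytes, which are then decoded properly.
    return text.encode("latin-1").decode("utf-8")


# The word "Apresentacao" with c-cedilla and a-tilde, after mis-decoding:
# each accented letter has become two Latin-1 characters.
garbled = "Apresenta\u00c3\u00a7\u00c3\u00a3o"
repaired = fix_latin1_mojibake(garbled)
print(repaired)
```

Text that was never mis-decoded will usually raise `UnicodeDecodeError` or `UnicodeEncodeError` here, so the repair is best applied only to input known to be affected.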
1

votes
1

answer
4.6k

Views

UnicodeEncodeError: 'cp949' codec can't encode character

How do I handle this? wfile.write(data['text']+'\n') UnicodeEncodeError: 'cp949' codec can't encode character import tweepy import time import os import json search_term1 = '' search_term2 = '' lat = '' lon = '' radius = '' location = '%s,%s,%s' % (lat, lon, radius) auth = tweepy.OAuthHandler(...
koko
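
One common way around the error in the question above is to open the output file with an explicit encoding instead of relying on the locale default. The sketch below assumes that is the cause here; the `data` dictionary and the file name are hypothetical stand-ins for the tweet payload and output file in the excerpt.

```python
import os
import tempfile

# Hypothetical stand-in for one tweet's payload; the emoji is outside cp949.
data = {"text": "hello \U0001F600"}

path = os.path.join(tempfile.mkdtemp(), "tweets.txt")

# Without encoding=..., open() uses the locale's preferred encoding (cp949 on
# a Korean Windows setup), which cannot represent every Unicode character.
with open(path, "w", encoding="utf-8") as wfile:
    wfile.write(data["text"] + "\n")

# Read the file back with the same encoding to confirm the round trip.
with open(path, encoding="utf-8") as rfile:
    round_tripped = rfile.read()
```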
1

votes
1

answer
49

Views

How is UTF-8 safe relative to ASCII chars

I was reading on Wikipedia, and came across the following: 'Since ASCII bytes do not occur when encoding non-ASCII code points into UTF-8, UTF-8 is safe to use within most programming and document languages that interpret certain ASCII characters in a special way, such as '/' in filenames, '\' in...
Learnerer
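
The property quoted in the question above can be checked directly: every byte in the UTF-8 encoding of a non-ASCII code point has its high bit set, so none of those bytes can be mistaken for an ASCII character such as '/'. A small Python sketch, with sample characters and a sample path chosen only for illustration:

```python
def utf8_bytes_are_non_ascii(ch: str) -> bool:
    """True if every byte of ch's UTF-8 encoding is >= 0x80."""
    return all(b >= 0x80 for b in ch.encode("utf-8"))


# e-acute (2 bytes), Hebrew shin (2 bytes), a Hangul syllable (3 bytes),
# and an emoji (4 bytes).
samples = ["\u00e9", "\u05e9", "\ud55c", "\U0001F600"]
results = {ch: utf8_bytes_are_non_ascii(ch) for ch in samples}

# Splitting the encoded bytes on b"/" therefore only ever splits at a real
# '/' in the text, never inside a multi-byte character.
path = "dossier/\u00e9/file.txt"
parts = path.encode("utf-8").split(b"/")
```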
1

votes
1

answer
23

Views

Converting string to wstring fails in Visual Studio

The following code is intended to provide a function to convert string in utf-8 to utf-16. But it fails. How can I fix it to work in Visual Studio 2017 C++17: #define _SILENCE_ALL_CXX17_DEPRECATION_WARNINGS #include #include #include #include #include using namespace std; wstring utf8_to_uft16(...
edukadoj
1

votes
2

answer
61

Views

What encoding is the resulting string if I concatenate a UTF-8 encoded string with an ASCII string in PHP?

If I use the function mb_convert_encoding() to convert an ASCII encoded string in PHP to a UTF-8 string, then concatenate it with an ASCII encoded string, what encoding is it? Are there any negative consequences for doing this?
AndreasKralj
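
The question above is about PHP, but the underlying point is about bytes and can be sketched in Python: ASCII text has the same bytes in ASCII and in UTF-8, so appending it to UTF-8 data still yields valid UTF-8. The sample strings are illustrative only.

```python
ascii_part = "Total: ".encode("ascii")
utf8_part = "caf\u00e9 \u20ac5".encode("utf-8")  # e-acute and the euro sign

# ASCII text encodes to identical bytes in ASCII and in UTF-8.
assert ascii_part == "Total: ".encode("utf-8")

combined = ascii_part + utf8_part
decoded = combined.decode("utf-8")  # the concatenation is still valid UTF-8
```

The reverse does not hold: `combined.decode("ascii")` would fail, because the result is no longer pure ASCII.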
1

votes
1

answer
46

Views

How to make Perl's XML::LibXML serializer use UTF-8 encoding?

I'd like to serialize an XML document with XML::LibXML, but it always converts UTF-8 characters to HTML-style representations: I get a character reference instead of 'á', etc... How can I make it use UTF-8 instead? use strict; use XML::LibXML; use utf8; my $str = 'árvíztűrő tükörfúrógép'; my $dom = XML::LibXML->load_xml(string =...
lmocsi
1

votes
2

answer
63

Views

Convert utf-8 unicode sequence to utf-8 chars in Python 3

I'm reading data from an AWS S3 bucket which happens to have Unicode characters escaped with double backslashes. The double backslashes make the Unicode escape sequence parse as a series of UTF-8 characters instead of the character that the sequence represents. The example illustrates the situation. >>> s1='1...
Leonard Saers
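
For the situation described above, one possible approach in Python 3 is the `unicode_escape` codec. This is a sketch under a stated assumption: apart from the escapes, the text is pure ASCII, since `unicode_escape` treats its input bytes as Latin-1 and would mangle other non-ASCII characters. The function name and sample string are hypothetical.

```python
def unescape_unicode(text: str) -> str:
    """Replace literal backslash-u escape sequences with their characters."""
    # encode("ascii") enforces the ASCII-only assumption: it raises
    # UnicodeEncodeError if the text already contains non-ASCII characters.
    return text.encode("ascii").decode("unicode_escape")


# A literal backslash followed by "u00e9", as it would arrive from storage.
escaped = "caf\\u00e9"
unescaped = unescape_unicode(escaped)
print(unescaped)
```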
1

votes
1

answer
50

Views

Encoding issues (German umlauts) with Excel/Java/Json

I have a little program in Java which reads contents from an .xlsx file and writes some of it into a new .json file. In the .xlsx cells there are some strings with German umlauts ('ä, ö, ü'). My problem: if the program is running on macOS everything works fine. If the program is running on Window...
Renata Samà
1

votes
2

answer
5.7k

Views

C#: Load *.txt to RichTextBox and convert into UTF8

I want to open text files and load them into a RichTextBox. This has been going fine so far, but now I'm struggling with an encoding issue. So I used the GetType() method from this StackOverflow page: How to find out the Encoding of a File? C# - and it returns 'System.Text.UnicodeEncoding'. My quest...
Momro
1

votes
0

answer
19

Views

Fully supporting utf8 in PHP - security issues?

After going through many examples on this site, I have now made my website fully utf8 compliant. However, webcollab.sourceforge.io says: Overly long UTF-8 sequences and UTF-16 surrogates are a serious security threat. Validation of input data is very important. An algorithm using preg_replace()...
Kiran
1

votes
2

answer
54

Views

How can the original case of the input be returned when using an array of only lowercase words in jquery textcomplete?

I use jquery textcomplete in my project. Is it possible to return the input in its original case while using a word array that is only in lowercase? I looked at the comparable functionality in Google Translate, and there too the result does not return the case of the input. There, the search is performed with the word entered in any case, but...
John
1

votes
2

answer
784

Views

How to get Android Volley StringRequest GET to return responses in UTF-8 encoding

How can I get Android Volley StringRequest GET to return responses in UTF-8 encoding? It only seems to return responses in ISO-8859-1. Is it possible to get it to accept a UTF-8 string?
Barry
1

votes
1

answer
49

Views

How do I add characters to an encoded string? Python

I need to send a request from the backend I'm writing in Python to an API via URL. The requests.get(URL) method automatically encodes the URL as UTF-8. The problem is that the API page, when there are special characters, such as in Russian, German, Spanish, etc.... in the address goes th...
Andres Martinez
1

votes
1

answer
525

Views

blob with conversion to 8bit cp1251 or cp1252

I need a solution with encoding utf to 8-bit cp1251 or cp1252 using blob I managed to change the https://github.com/b4stien/js-csv-encoding including windows 1251, but there are insoluble problems: Unfortunately noscript does not allow loading external javascript on a page with scripts turned off v...
nnovic
1

votes
3

answer
77

Views

PHP Propel 1.6 and MySQL - save() using utf8 not working

I have installed Propel 1.6. I can create tables in MySQL with propel commands. Below is my propel settings in file: runtime-config.xml mysql mysql:host=localhost;dbname=myDBname myUser mypass **utf8 utf8_unicode_ci** MySQL database and table User has collation utf8_unicode_ci (see photo below): my...
billato
1

votes
1

answer
45

Views

python3 weird chars and utf-8, can't remove them

I have a problem: I am trying to get a string to be equal in python3 and in mysql. I expect it should be utf-8, but it is not the same. I have this string stationær pc > stationær pc and what I want now is for it to look like this stationr pc > stationr pc and i have...
ParisNakitaKejser
1

votes
2

answer
322

Views

UTF-8 encoding for XML with php and accent characters along with ENT_XML1

An ongoing issue for over a year, that I thought I had corrected but has evolved into a monster. I move large amounts of data between sites using XML generated on PHP systems. Mainly text. I ran into some basic XML items that broke the transfer so I used this code on all XML values. $value=str_replac...
Radium Chris
1

votes
1

answer
399

Views

FFMPEG UTF-8 subtitles not properly displayed in mp4 file

I am trying to add UTF-8 Telugu subtitles to mp4 file using ffmpeg. The subtitles are not properly getting displayed. I am using the command, ffmpeg -i input.mp4 -vf 'subtitles='input.srt:force_style=Fontsize=24' ' output.mp4 I also tried the following, ffmpeg -i input.mp4 -vf 'subtitles='input.sr...
kamalakar b
1

votes
1

answer
80

Views

Fixing mojibakes in UTF-8 text

I have a file with text in Portuguese in UTF-8. Somehow, whoever produced the file selected the wrong encoding, and the text is full of mojibake: IDENTIFICAÌàÌÄO instead of identificação André instead of André Automated tools do not see anything wrong with the file. I tried to fix it with Pyth...
Strabonio
1

votes
0

answer
212

Views

Angular 5 Formdata with Ä Ö Ü as values

I want to upload files to a server. So I append them to a FormData object let formData: FormData = new FormData(); formData.append('value1', 'ÖÄÜ'); formData.append('uploadFile', file, file.name); and try to post them to the server this.http.post(`uploadURL`, formData) .subscribe( data => console.log...
FunkRehkitz
1

votes
0

answer
43

Views

Encode/decode a string for it to pass umlauts in filenames?

I have a script that takes the argument g. After that a relative filename is expected. So I have a https://server.com/index.php?g=test/test.php. I know what you say, but this is not about security. Now I want to use this script from within the server. I am German, so my files sometimes have umlauts....
Alex
1

votes
0

answer
98

Views

Why are the same special characters e with accent showing differently on two of my websites?

I have a WordPress site (using the Salient theme) and a shop site. Both use the same fonts. On the WordPress site there is a text with an e-acute (é) in it. This displays correctly, but on my shop the same character displays differently (it seems bolder and smaller in both Chrome and Firefox). I have trie...
Wouter T
1

votes
0

answer
710

Views

Asp.Net Core 2.0 MVC app: return response in non-utf8 character encoding

I did the following: Installed System.Text.Encoding.CodePages 4.4.0 In Startup.ConfigureServices() I have: Encoding.RegisterProvider(CodePagesEncodingProvider.Instance); In CustomersController public async Task Register() { ... return View(); } Register.cshtml content like below: @{ Layout = null; }...
synergetic
1

votes
1

answer
53

Views

Sed insert variable containing UTF8 chars

I am trying to insert a html-directory structure generated by tree via sed into an existing html file. Problem: The directory structure contains html tags and non-ASCII chars: Wiki ├── Servers │   ├── Plesk-Umzug.html │   └── web02.html ├── SmartHome │   └──...
Max
1

votes
1

answer
27

Views

escape a string vs utf_8 string difference [closed]

I have a String as: [MK II Hatchback][Citroën][1.6 VTI 120][2009] Now I want to escape this string into UTF_8. Is this the right way to do it? And what is this special character ë? String myString = '[MK II Hatchback][Citroën][1.6 VTI 120][2009]'; String value = new String(myString.getBytes(UTF_8...
flash
1

votes
0

answer
91

Views

Python - Decode Text (German) from Hexadecimal Value and write it to a file

I know there are a lot of encoding/decoding topics here and I tried it for hours, but I'm still not able to solve it. Hence I want to raise a question: I have a string with hexadecimal values, which in the end is a text that I want to write to a text file with the correct encoding. hexvalues = '4765...
Ginsor
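
A minimal Python 3 sketch of the task described above: turn a string of hex digits into text and write it to a file with a known encoding. The `hexvalues` sample is hypothetical (the one in the excerpt is truncated), and the sketch assumes the hex digits represent UTF-8 bytes; if they represent Latin-1 or cp1252 bytes, the argument to `decode` would need to change accordingly.

```python
import os
import tempfile

# Hypothetical sample: the UTF-8 bytes of a German word with u-umlaut and
# sharp s, written as hex digits.
hexvalues = "4772c3bcc39f65"

# Hex digits -> raw bytes -> str (assuming the bytes are UTF-8).
text = bytes.fromhex(hexvalues).decode("utf-8")

# Write the text with an explicit encoding rather than the platform default.
path = os.path.join(tempfile.mkdtemp(), "out.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(text)
```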
1

votes
0

answer
232

Views

Loading of file having non-printable UTF-8 characters to apache-hive table showing junk characters/boxes � �

When I'm trying to load a file to Hive, some of the characters are not getting interpreted properly and are coming up as boxes in hive. Using the 'file' command, I get the file type as below: ksh> file -bi data.TXT text/plain; charset=us-ascii Below is a data sample (opened in vim editor in a Putty...
Balkrishan Aggarwal
1

votes
0

answer
58

Views

Unable to load data from jitterbit to salesforce

We are trying to load data from a MySQL database to Salesforce through Jitterbit (v: 8.16.13.1) on a Windows 2012 R2 machine. We have created a MySQL view using Jitterbit Cloud Data Loader to load the view data into Salesforce objects. We have UTF-8 special characters, e.g. .Ã,®,Ø,ÿ in the database fie...
Prajay Verenkar
1

votes
0

answer
158

Views

C# HttpListenerRequest QueryString UTF-8 decoding issue

I've written a http service using HttpListener. The service receives GET requests, with a single query string variable req that holds all parameters in a json-encoded string, then performs some tasks. The requests are generated from other servers and query strings are url-encoded using UTF-8. The pr...
Alberto Pastore
1

votes
0

answer
193

Views

UnicodeDecodeError Python/Django application

I'm getting this error UnicodeDecodeError at /select_text 'utf-8' codec can't decode byte 0xe7 in position 92: invalid continuation byte Request Method: POST Request URL: http://agata.pgie.ufrgs.br/select_text Django Version: 2.0.1 Exception Type: UnicodeDecodeError Exception Value: 'utf-8'...
alvarosps
1

votes
1

answer
88

Views

Invalid Salt on Local versus Heroku (Encoding UTF-8 problems)

The Issue: It appears there is some sort of encoding/decoding issue going on with flask-bcrypt for my Mac. I'd like to know if there is an easy solution to fix this so I can run my local with a similar setup to my Heroku version. Comparison: Local If I use .decode('utf-8') with generate_password...
dizzy
1

votes
1

answer
269

Views

Selenium Python web scraping UTF-8

Maybe this question was asked before but since I could not find a proper answer, I dare to ask a similar one. My problem is, I have been trying to scrape a Turkish car sale web site which is named 'Sahibinden'. I use Jupyter Notebook and Sublime editors. Once I try to get the data written in a csv fi...
Mike
1

votes
0

answer
99

Views

ExifInterface UTF-8 support

I am trying to write TAMIL characters in Exif tags, it writes without any error/warning, try{ ExifInterface exifInterface = new ExifInterface(someFile.getPath()); String text='ENGLISHதமிழ்'; exifInterface.setAttribute(ExifInterface.TAG_IMAGE_DESCRIPTION,text); exifInterface.saveAttribute...
Bharthikannan
1

votes
0

answer
69

Views

Cannot use rmarkdown from RGui

I would like to generate a PDF via RMarkdown from the RGui. I am able to knit the file via RStudio but it fails every time via RGui. This is 'essai.r' file that contains the render command. library(rmarkdown) library(knitr) Sys.setenv(RSTUDIO_PANDOC='C:/Program Files (x86)/Pandoc') rmarkdown::render...
user1752610
1

votes
3

answer
44

Views

How to save/load a 2d array containing unicode to/from a .txt file [Python] [encoding issues][utf8]

The basic issue I am having is that when I write to the txt file, the Unicode character \u2656 becomes b'\xe2\x99\x96' (I believe this is byte code?). Then when I read the file I cannot decode it back to \u2656. Board_visual is just a 2d array where each item is either a unicode character or None...
A Newling
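
One way to round-trip such a structure is to write text rather than encoded bytes, and to let a serializer handle the nesting and the `None` values. The sketch below uses JSON, which is a substitution for whatever format the question's code uses; the small `board_visual` sample and the file name are hypothetical.

```python
import json
import os
import tempfile

# Hypothetical 2x2 slice of a board: chess symbols or None for empty squares.
board_visual = [["\u2656", None], [None, "\u265c"]]

path = os.path.join(tempfile.mkdtemp(), "board.txt")

# Write str, not bytes: calling .encode() and then str() on the result is
# what produces text like b'\xe2\x99\x96' in the file.
with open(path, "w", encoding="utf-8") as f:
    json.dump(board_visual, f, ensure_ascii=False)

# None is stored as JSON null and comes back as None on load.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)
```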
1

votes
0

answer
36

Views

Why is Python3 decode/encode duplicating characters?

I am trying to check the subject of emails in foreign languages for automatic tests. I was having issues with encoding so I decided to try and write something that handles the encoding of the subject. In this case it's given to me in base64. Converting this to utf-8 and then decoding it produces...
ellsworthless
1

votes
0

answer
41

Views

realm is generating � while retrieving data which is encoded in utf-8

I am using Realm to store and retrieve UTF-8 Nepali (Devanagari) text, but when retrieving it, � also appears, which is not a Nepali (Devanagari) character and was not stored. How can I solve this issue? Along with Realm I have used Gson and Retrofit. Gson gson = new GsonBuilder().disableHtmlEscap...
Aaiam Litigoner
1

votes
0

answer
101

Views

R delete special characters (faster way)

I have a huge data frame, with some columns containing 'characters'. The problem is that I have some 'wrong' characters, like this: mutate_all(data, funs(tolower)) > Error in mutate_impl(.data, dots) : Evaluation error: invalid input > 'https://www.ps.f/c-w/nos-promions/v-ambght-rembment.html#mo...
R overflow
