> Abnormal behavior of api response

 
post Aug 28 2022, 06:23
Post #1
Sachia Lanlus



Regular Poster
*****
Group: Gold Star Club
Posts: 716
Joined: 29-March 15
Level 500 (Ponyslayer)


I have built a custom EH reader for my Android tablet, and I use the API to get the tag information for each gallery.
I just found that it crashes every time I open one specific gallery:
https://e-hentai.org/g/2310902/4b65bf6480/
When my application makes a request to the EH API endpoint for this gallery, it always responds with an empty string, which is what causes the crash.

I have no idea why.
Maybe it is the name of the uploader?

Here is the Python code to reproduce the error:
```
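import requests

# gdata request for the one affected gallery: [gallery_id, gallery_token]
request_json = {'method': 'gdata', 'gidlist': [[2310902, '4b65bf6480']], 'namespace': 1}

s = requests.Session()
r = s.post(url='https://api.e-hentai.org/api.php', json=request_json)
# For this gallery the response body is an empty string instead of JSON
print(r.text)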
```
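Until the cause is found, the crash itself can be avoided on the client side by treating an empty or non-JSON body as "no metadata" instead of parsing it blindly. A minimal sketch, assuming the response normally has the shape `{"gmetadata": [...]}`; the function name `parse_gdata` and the `None` fallback are illustrative choices, not part of the API:

```python
import json
from typing import Optional


def parse_gdata(body: str) -> Optional[dict]:
    """Return the first gallery's metadata, or None if the body is unusable.

    Assumption: a successful response looks like {"gmetadata": [ {...} ]}.
    """
    if not body.strip():
        # The case described above: the API answered with an empty string.
        return None
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        # Body was not JSON at all (e.g. an HTML error page).
        return None
    if not isinstance(data, dict):
        return None
    entries = data.get("gmetadata")
    if not isinstance(entries, list) or not entries:
        return None
    return entries[0]
```

With the script above, `parse_gdata(r.text)` would return `None` for this gallery, so the reader could show a "metadata unavailable" message rather than crash.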

 
post Aug 28 2022, 14:26
Post #2
uareader



Critter
*********
Group: Catgirl Camarilla
Posts: 5,592
Joined: 1-September 14
Level 500 (Ponyslayer)


Yeah, the uploader name is suspicious.
I think my browser can render European/American, Japanese, Chinese, Korean, Russian and other scripts... if someone still manages to get some � in their name, are they really not trying to break stuff?

 
post Aug 28 2022, 15:08
Post #3
Sachia Lanlus



Regular Poster
*****
Group: Gold Star Club
Posts: 716
Joined: 29-March 15
Level 500 (Ponyslayer)


QUOTE(uareader @ Aug 28 2022, 20:26) *

Yeah, the uploader name is suspicious.
I think my browser can render European/American, Japanese, Chinese, Korean, Russian and other scripts... if someone still manages to get some � in their name, are they really not trying to break stuff?


Agreed.
It is weird to see unknown characters in a modern browser with Unicode support.
Maybe we need 10bro to take a look at the API server's logs.

 
post Aug 28 2022, 18:58
Post #4
atasitian



Casual Poster
****
Group: Members
Posts: 422
Joined: 18-July 10
Level 322 (Godslayer)


From what I can see, the gallery page is declared as being UTF-8, but the uploader name clearly isn't valid UTF-8. The raw bytes I see are D2 BB C6 B7 D5 AC C4 D0. The first three pairs of bytes can be decoded as UTF-8 characters corresponding to U+04BB CYRILLIC SMALL LETTER SHHA, U+01B7 LATIN CAPITAL LETTER EZH, and U+056C ARMENIAN SMALL LETTER LIWN, giving "һƷլ". But the last two bytes are not valid UTF-8 (they both look like start bytes without a continuation byte).

I'm not familiar with Chinese, but those bytes in the uploader name look suspiciously like Big5 encoded text. I think they would correspond to "珨ⅲ晙鹹", but I have no idea what (if anything) that means.

TL;DR: I think the user wrote their name using the Big5 encoding of Chinese text, but e-h is trying to interpret it as UTF-8, which breaks in various ways when displayed or processed.
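The UTF-8 reading of those bytes can be checked directly. A small sketch, using only the eight bytes quoted above; the variable names are illustrative:

```python
# The raw uploader-name bytes as quoted in the post.
raw = bytes.fromhex("D2 BB C6 B7 D5 AC C4 D0")

# A strict UTF-8 decode fails because the trailing bytes are malformed.
try:
    raw.decode("utf-8")
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = exc

# A lenient decode keeps the three valid two-byte characters and
# substitutes U+FFFD (the replacement character) for the rest.
lenient = raw.decode("utf-8", errors="replace")

print("strict decode error:", strict_error)
print("lenient decode:", lenient)
print("code points:", [f"U+{ord(ch):04X}" for ch in lenient])
```

The strict decode stops at the seventh byte (`C4`), which is a start byte followed by another start byte rather than a continuation byte.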

 
post Aug 28 2022, 21:24
Post #5
Tenboro

Admin




Yeah, looks like the JSON encoder is choking on the username. I've fixed the uploader's name, so it should work now.
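How a strict JSON encoder can turn one bad field into an empty response can be illustrated as follows. This is only a Python sketch of the general failure mode, not the server's actual code, which is not shown in this thread; `encode_strict` and `encode_lenient` are hypothetical helpers:

```python
import json

# The uploader-name bytes quoted earlier in the thread.
raw_name = bytes.fromhex("D2BBC6B7D5ACC4D0")


def encode_strict(name: bytes) -> str:
    """Hypothetical encoder that gives up (returns "") on invalid UTF-8."""
    try:
        return json.dumps({"uploader": name.decode("utf-8")}, ensure_ascii=False)
    except UnicodeDecodeError:
        # One malformed field makes the whole document unencodable.
        return ""


def encode_lenient(name: bytes) -> str:
    """Same, but substitutes U+FFFD for bytes that cannot be decoded."""
    text = name.decode("utf-8", errors="replace")
    return json.dumps({"uploader": text}, ensure_ascii=False)


print(repr(encode_strict(raw_name)))
print(encode_lenient(raw_name))
```

Under these assumptions the strict variant yields an empty string for this name, which matches the symptom reported in the first post, while the lenient variant still produces valid JSON.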

 
post Aug 29 2022, 02:52
Post #6
mewsf



Regular Poster
*****
Group: Gold Star Club
Posts: 565
Joined: 24-June 14
Level 500 (Ponyslayer)


QUOTE(atasitian @ Aug 29 2022, 00:58) *

From what I can see, the gallery page is declared as being UTF-8, but the uploader name clearly isn't valid UTF-8. The raw bytes I see are D2 BB C6 B7 D5 AC C4 D0. The first three pairs of bytes can be decoded as UTF-8 characters corresponding to U+04BB CYRILLIC SMALL LETTER SHHA, U+01B7 LATIN CAPITAL LETTER EZH, and U+056C ARMENIAN SMALL LETTER LIWN, giving "һƷլ". But the last two bytes are not valid UTF-8 (they both look like start bytes without a continuation byte).

I'm not familiar with Chinese, but those bytes in the uploader name look suspiciously like Big5 encoded text. I think they would correspond to "珨ⅲ晙鹹", but I have no idea what (if anything) that means.

TL;DR: I think the user wrote their name using the Big5 encoding of Chinese text, but e-h is trying to interpret it as UTF-8, which breaks in various ways when displayed or processed.


Big5 encodes Traditional Chinese characters, and "珨ⅲ晙鹹" doesn't seem to mean anything. The uploader is probably using Simplified Chinese, so the name is more likely encoded in GBK/GB2312/GB18030, in which case it decodes as "一品宅", which at least looks reasonable. It would be better to ask the user what they intended, although I doubt they will see forum posts or PMs.

 
post Aug 29 2022, 09:19
Post #7
cs987987



笑看牆國人礦小粉紅
*********
Group: Gold Star Club
Posts: 7,152
Joined: 11-March 12
Level 500 (Ponyslayer)


The ID is "一品宅男" when decoded as GBK or GB18030.
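This is easy to confirm by decoding the same eight bytes with each candidate encoding. A minimal sketch; the list of candidates is just the set of encodings mentioned in this thread, and `errors="replace"` is used so that a codec that cannot map some bytes still produces output for comparison:

```python
# The uploader-name bytes quoted earlier in the thread.
raw = bytes.fromhex("D2BBC6B7D5ACC4D0")

# Encodings that have come up in this thread.
candidates = ["gbk", "gb18030", "big5", "utf-8"]

# Decode leniently so every candidate yields something to look at.
decoded = {enc: raw.decode(enc, errors="replace") for enc in candidates}

for enc, text in decoded.items():
    print(f"{enc:8} -> {text}")
```

Only the GBK and GB18030 rows give a readable name; the others produce unrelated characters or replacement characters.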

 
post Aug 29 2022, 22:02
Post #8
Tenboro

Admin




QUOTE(cs987987 @ Aug 29 2022, 09:19) *
The ID is "一品宅男" when decoded as GBK or GB18030.


Eh, not sure if they care, but fixed anyway.

