This question about LDAP: Asked
LDAP re-encodes a UTF-8 string
--
YangCheng - 07 Sep 2016
I happened to find that all the Chinese text downloaded from my LDAP server is wrongly encoded to UTF-8 again, even though it is already UTF-8.
Here is an example:
1. When I get an entry from LDAP using the Linux command "ldapsearch", I get:
dn='CN=chengyang [\xc3\xa7\xc2\xa8\xc2\x8b\xc3\xa6\xc2\xb4\xc2\x8b],OU=Employee,OU=Mioffice,DC=mioffice,DC=cn'
It's already a UTF-8 string.
2. But when I get the same entry from the LdapContrib logs, I get:
dn='CN=chengyang [\xc3\xa7\xc2\xa8\xc2\x8b\xc3\xa6\xc2\xb4\xc2\x8b],OU=Employee,OU=Mioffice,DC=mioffice,DC=cn'
You can see it has been wrongly encoded as UTF-8 a second time.
I also found that the string already looks like this before cacheGroupFromEntry is called. Is this a Net::LDAP bug?
But when I just write a test.pm that queries using Net::LDAP, everything is all right.
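The escaped bytes in the log are consistent with UTF-8 that was encoded a second time. Here is a minimal sketch of that mechanism, assuming the underlying name is the two-character string whose UTF-8 bytes are e7 a8 8b e6 b4 8b. The helper names double_encode and hex_escape are made up for illustration and are not part of Net::LDAP or LdapContrib.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Assumed input: the name as correct UTF-8 bytes (two CJK characters).
my $utf8_bytes = "\xe7\xa8\x8b\xe6\xb4\x8b";

# Misread each byte as one Latin-1 character, then encode to UTF-8 again.
# This is the classic "double encoding" mistake.
sub double_encode {
    my ($bytes) = @_;
    return encode('UTF-8', decode('ISO-8859-1', $bytes));
}

# Render bytes in the \xNN style used by the Apache error log.
sub hex_escape {
    my ($bytes) = @_;
    return join '', map { sprintf '\\x%02x', $_ } unpack 'C*', $bytes;
}

print hex_escape(double_encode($utf8_bytes)), "\n";
# prints \xc3\xa7\xc2\xa8\xc2\x8b\xc3\xa6\xc2\xb4\xc2\x8b
```

Each original byte above 0x7f turns into two bytes (c3 or c2 followed by a continuation byte), which matches the pattern in the logged dn.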
--
YangCheng - 07 Sep 2016
use strict;
use warnings;
use Net::LDAP;
my $ldap = Net::LDAP->new('********') or die "$@";
# authenticated bind: a DN and a password are supplied
my $mesg = $ldap->bind('***************', password => '********');
$mesg = $ldap->search(    # perform a search
    base   => "OU=Mioffice,DC=mioffice,DC=cn",
    filter => "sAMAccountName=chengyang"
);
$mesg->code && die $mesg->error;
foreach my $entry ($mesg->entries) {
    # $entry->dump;
    my $dn = $entry->dn();
    print STDOUT "dn is $dn\n";
    # replace every character with its hex code to expose the encoding
    $dn =~ s/(.)/sprintf("0x%x",ord($1))/eg;
    print STDOUT "dn hex is $dn\n";
}
It's really weird. If I run the code above as a standalone Perl script from the command line, I get the correct encoding.
But if I run it inside LdapContrib.pm, the Apache error log shows:
[Thu Sep 08 10:38:17.676292 2016] [cgi:error] [pid 49315] [client 10.237.130.70:49742] AH01215: cytest dn is CN=chengyang [\xc3\xa7\xc2\xa8\xc2\x8b\xc3\xa6\xc2\xb4\xc2\x8b],OU=Employee,OU=Mioffice,DC=mioffice,DC=cn, referer: http://wiki.bsp.xiaomi.com/Main/WikiGroups
Why? This is a really damn bug
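One way to narrow down where the extra encoding happens is to log, in both environments, whether Perl holds the value as characters or as raw bytes, together with the code of every element. The sketch below uses only core Perl. describe_string is a hypothetical helper name, and whether the UTF8 flag actually differs between the command line and the wiki is an assumption to be checked, not a confirmed cause.

```perl
use strict;
use warnings;

# Report whether a scalar is a character string (UTF8 flag on) or a byte
# string, followed by the hex code of each element.
sub describe_string {
    my ($s) = @_;
    my $kind = utf8::is_utf8($s) ? 'characters' : 'bytes';
    my $hex  = join ' ', map { sprintf('%02x', ord($_)) } split(//, $s);
    return "$kind: $hex";
}

print describe_string("\xe7\xa8\x8b"), "\n";    # raw UTF-8 bytes
print describe_string("\x{7a0b}"), "\n";        # the same text as one character
```

Calling it on the dn right after $entry->dn(), once in the standalone script and once inside the plugin, should show whether the two environments receive the same data or whether the difference only appears on output.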
--
YangCheng - 08 Sep 2016
The encoding is already wrong when the value comes back from Net::LDAP itself, but this depends on the wiki plugin environment.
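Until the root cause is found, a guarded repair step could serve as a stopgap. This is only a sketch, not something LdapContrib or Net::LDAP provides: undo_double_utf8 is a hypothetical helper that strips one extra layer of UTF-8 encoding only when the result is itself valid UTF-8, and otherwise returns the input unchanged.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: undo one extra round of UTF-8 encoding, but only
# when that yields valid UTF-8 again. Anything else is returned as is.
sub undo_double_utf8 {
    my ($bytes) = @_;
    my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();

    # First decode: fails (and we give up) if the input is not UTF-8.
    my $once = eval { decode('UTF-8', $bytes, $check) };
    return $bytes unless defined $once;

    # Every character must fit in one byte to be re-read as octets.
    return $bytes if $once =~ /[^\x00-\xff]/;
    my $octets = encode('ISO-8859-1', $once);

    # Pure ASCII needs no repair.
    return $bytes unless $octets =~ /[\x80-\xff]/;

    # Accept the repair only if the octets are valid UTF-8 themselves.
    my $ok = eval { decode('UTF-8', $octets, $check); 1 };
    return $ok ? $octets : $bytes;
}
```

Correctly encoded values pass through untouched because their decoded characters do not fit in one byte, or because the re-read octets are not valid UTF-8, so the function should be safe to apply to every attribute. It hides the symptom rather than fixing the cause.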
--
YangCheng - 08 Sep 2016