hive - Extra backslash in Sqoop import result
I'm currently using Sqoop to import data from an HP Vertica database into Hive. For a column that contains a special character, the result differs from the data in Vertica. Here is the command:
sqoop import --driver com.vertica.jdbc.Driver --connect jdbc:vertica://db.foo.com/corp \
--username xx --P --where 'src_sys_cd=xxx' --null-string '\\N' --null-non-string '\\N' \
--m 1 --fields-terminated-by '\001' --hive-drop-import-delims --table addr \
--target-dir /xxxx/addr
Data in the Vertica DB:
src_sys_cd  ctry_cd  addr_id  addr_typ_cd  addr_str_1_lg_nm
123456      nz       107560   null         c\ - 108 waiatarua road
Data shown in the Hive DB:
src_sys_cd  ctry_cd  addr_id  addr_typ_cd  addr_str_1_lg_nm
123456      nz       107560   null         c\\ - 108 waiatarua road
The difference is in column addr_str_1_lg_nm: after the Sqoop import, one extra backslash (\) has been added. Columns that do not contain a backslash (\) are unchanged.
Since there are NULL values in Vertica, I must use --null-string '\\N' --null-non-string '\\N'.
I've tried other options, such as:
--escaped-by \\ --optionally-enclosed-by '\"'
but they did not work.
For databases where Sqoop supports direct connect, use --direct and remove --hive-drop-import-delims to import the data as-is.
This link lists the databases Sqoop supports.
Vertica is not listed there, but I've confirmed that it supports direct connect.
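As a rough sketch of that suggestion, the import from the question could be rearranged as below: --direct is added and --hive-drop-import-delims is dropped, with everything else reused from the question. Whether --direct is actually honoured for Vertica depends on the connector in your Sqoop installation, so treat that flag as an assumption. The script only prints the command for review; the real invocation is left commented out.

```shell
#!/bin/sh
# Sketch only: the import from the question, switched to direct mode.
# Assumption: the Vertica connector in this Sqoop install accepts --direct.
# Host, database, table and paths are the placeholders used in the question.
set -- import \
  --driver com.vertica.jdbc.Driver \
  --connect jdbc:vertica://db.foo.com/corp \
  --username xx --P \
  --where 'src_sys_cd=xxx' \
  --null-string '\\N' --null-non-string '\\N' \
  --m 1 \
  --fields-terminated-by '\001' \
  --direct \
  --table addr \
  --target-dir /xxxx/addr

# Dry run: show the arguments that would be passed to sqoop.
printf '%s ' sqoop "$@"
echo

# Uncomment to run the import for real:
# sqoop "$@"
```

After the import, comparing addr_str_1_lg_nm for the affected row in Hive against Vertica should show whether the extra backslash is gone.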