
hadoop - Create schema on hbase/hive from dynamic unstructured sources

Problem description:

We want to use Hive to analyze data of the kind described below. These are the challenges:

The source data are flat files from different sources, with multiple source files arriving daily.

There are no fixed columns (each file has different columns).

Each file has a very large number of rows.

The number of columns and the order of the columns differ from file to file.

Each field is comma-separated, but field values might contain quotes ("").

Please suggest the ideal approach here. Should we load the data into HBase and create a Hive table on top of it? Or is it possible to create a Hive table with a dynamic schema?
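
One way to picture the dynamic-schema option is to stop treating each header as a table column and instead normalize every row into a map from column name to value. The sketch below is only an illustration in Python, under the assumption (not stated above) that every file starts with a header row naming its columns. The function names `rows_to_records` and `to_json_lines` and the sample file names are hypothetical. It relies on the standard `csv` module to handle quoted fields that contain commas.

```python
import csv
import io
import json


def rows_to_records(text, source_name):
    """Yield one dict per data row, keyed by that file's own header names.

    Assumption: the first line of `text` is a header row.
    """
    reader = csv.DictReader(io.StringIO(text))
    # row_no counts data rows only (the header is not counted).
    for row_no, row in enumerate(reader, start=1):
        # Skip overflow fields (key None) and missing or empty values,
        # so each record carries only the columns actually present.
        fields = {
            name: value
            for name, value in row.items()
            if name is not None and value not in (None, "")
        }
        yield {"source": source_name, "row": row_no, "fields": fields}


def to_json_lines(records):
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(rec, sort_keys=True) for rec in records)


# Two hypothetical files with different columns in a different order.
file_a = 'id,name,city\n1,"Smith, John",Pune\n'
file_b = 'city,amount,id\nDelhi,"1,200",7\n'

records = list(rows_to_records(file_a, "a.csv"))
records += list(rows_to_records(file_b, "b.csv"))
print(to_json_lines(records))
```

Every record has the same three top-level keys no matter which columns its file carried, so the output could plausibly be loaded into a Hive table with a MAP<STRING,STRING> column, or into HBase with one column qualifier per field name. Whether this scales to very large daily files would have to be tested on the real data.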
