我有一个如下所示的架构。我如何解析嵌套对象
root
|-- apps: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- appName: string (nullable = true)
| | |-- appPackage: string (nullable = true)
| | |-- Ratings: array (nullable = true)
| | | |-- element: struct (containsNull = true)
| | | | |-- date: string (nullable = true)
| | | | |-- rating: long (nullable = true)
|-- id: string (nullable = true)
最佳答案
假设您读入一个 json 文件并打印您向我们展示的架构,如下所示:
DataFrame df = sqlContext.read().json("/path/to/file").toDF();
df.registerTempTable("df");
df.printSchema();
然后您可以像这样选择结构类型中的嵌套对象...
DataFrame app = df.select("app");
app.registerTempTable("app");
app.printSchema();
app.show();
DataFrame appName = app.select("element.appName");
appName.registerTempTable("appName");
appName.printSchema();
appName.show();
https://stackoverflow.com/questions/29948789/