Java Code Examples for org.apache.parquet.schema.MessageType#getFieldCount()

The following examples show how to use org.apache.parquet.schema.MessageType#getFieldCount(), which returns the number of top-level fields in the schema. The source file and project of each example are listed above its code.
Example 1
Source File: ParquetFileAccessor.java    From pxf with Apache License 2.0
/**
 * Builds a map of names to Types from the original schema, the map allows
 * easy access from a given column name to the schema {@link Type}.
 *
 * @param originalSchema the original schema of the parquet file
 * @return a map of field names to types
 */
private Map<String, Type> getOriginalFieldsMap(MessageType originalSchema) {
    Map<String, Type> originalFields = new HashMap<>(originalSchema.getFieldCount() * 2);

    // We need to add the original name and lower cased name to
    // the map to support mixed case where in GPDB the column name
    // was created with quotes i.e "mIxEd CaSe". When quotes are not
    // used to create a table in GPDB, the name of the column will
    // always come in lower-case
    originalSchema.getFields().forEach(t -> {
        String columnName = t.getName();
        originalFields.put(columnName, t);
        originalFields.put(columnName.toLowerCase(), t);
    });

    return originalFields;
}
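
The double registration described in the comments above can be tried without Parquet on the classpath. The following is a minimal sketch, not the PXF implementation: field names are plain strings, and the class name `MixedCaseLookupSketch` and method `buildLookup` are hypothetical. It also passes `Locale.ROOT` to `toLowerCase`, which is an assumption added here so the result does not depend on the JVM's default locale.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Stand-in sketch: a schema is modeled as a list of field names.
// All names in this class are hypothetical and not part of Parquet or PXF.
class MixedCaseLookupSketch {

    // Registers every field under its original spelling and its lower-cased
    // spelling, so both quoted (mixed-case) and unquoted (lower-case) column
    // names resolve to the same field.
    static Map<String, String> buildLookup(List<String> fieldNames) {
        // Each field can contribute up to two keys, hence size * 2.
        Map<String, String> lookup = new HashMap<>(fieldNames.size() * 2);
        for (String name : fieldNames) {
            lookup.put(name, name);
            lookup.put(name.toLowerCase(Locale.ROOT), name);
        }
        return lookup;
    }

    public static void main(String[] args) {
        Map<String, String> lookup = buildLookup(Arrays.asList("mIxEd CaSe", "id"));
        System.out.println(lookup.get("mIxEd CaSe")); // prints "mIxEd CaSe"
        System.out.println(lookup.get("mixed case")); // prints "mIxEd CaSe"
        System.out.println(lookup.get("MIXED CASE")); // prints "null"
    }
}
```

Only the exact and the lower-cased spellings are registered, so any other casing misses. A field whose name is already lower-case occupies a single key.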
 
Example 2
Source File: ParquetBaseTest.java    From pxf with Apache License 2.0
private Map<String, Type> getOriginalFieldsMap(MessageType originalSchema) {
    Map<String, Type> originalFields = new HashMap<>(originalSchema.getFieldCount() * 2);

    // We need to add the original name and lower cased name to
    // the map to support mixed case where in GPDB the column name
    // was created with quotes i.e "mIxEd CaSe". When quotes are not
    // used to create a table in GPDB, the name of the column will
    // always come in lower-case
    originalSchema.getFields().forEach(t -> {
        String columnName = t.getName();
        originalFields.put(columnName, t);
        originalFields.put(columnName.toLowerCase(), t);
    });

    return originalFields;
}
 
Example 3
Source File: SchemaIntersection.java    From parquet-mr with Apache License 2.0
public SchemaIntersection(MessageType fileSchema, Fields requestedFields) {
  if(requestedFields == Fields.UNKNOWN)
    requestedFields = Fields.ALL;

  Fields newFields = Fields.NONE;
  List<Type> newSchemaFields = new ArrayList<Type>();
  int schemaSize = fileSchema.getFieldCount();

  for (int i = 0; i < schemaSize; i++) {
    Type type = fileSchema.getType(i);
    Fields name = new Fields(type.getName());

    if(requestedFields.contains(name)) {
      newFields = newFields.append(name);
      newSchemaFields.add(type);
    }
  }

  this.sourceFields = newFields;
  this.requestedSchema = new MessageType(fileSchema.getName(), newSchemaFields);
}
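
The index loop in the constructor above keeps only the requested fields, in file-schema order. Below is a minimal sketch of that loop with plain collections standing in for `MessageType` and `Fields`. The class `FieldIntersectionSketch` and its `intersect` method are hypothetical names, and the code illustrates the pattern only, not the parquet-mr implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Stand-in sketch: the file schema is a list of field names and the request
// is a set of names. All names in this class are hypothetical.
class FieldIntersectionSketch {

    // Walks the file schema by index and keeps the fields that were requested.
    // The result follows the order of the file schema, not of the request.
    static List<String> intersect(List<String> fileFields, Set<String> requested) {
        List<String> kept = new ArrayList<>();
        int fieldCount = fileFields.size(); // plays the role of getFieldCount()
        for (int i = 0; i < fieldCount; i++) {
            String name = fileFields.get(i);
            if (requested.contains(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> fileFields = Arrays.asList("id", "name", "price");
        Set<String> requested = new HashSet<>(Arrays.asList("price", "id", "missing"));
        System.out.println(intersect(fileFields, requested)); // prints "[id, price]"
    }
}
```

Requested names that are absent from the file schema, such as `missing` here, are silently dropped.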
 
Example 4
Source File: ParquetPageSourceFactory.java    From presto with Apache License 2.0
private static org.apache.parquet.schema.Type getParquetType(HiveColumnHandle column, MessageType messageType, boolean useParquetColumnNames)
{
    if (useParquetColumnNames) {
        return getParquetTypeByName(column.getBaseColumnName(), messageType);
    }

    if (column.getBaseHiveColumnIndex() < messageType.getFieldCount()) {
        return messageType.getType(column.getBaseHiveColumnIndex());
    }
    return null;
}
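
The positional branch above uses the field count as an upper bound before indexing into the schema, and returns null when the file has fewer columns than the table. A minimal sketch of that guard follows, with a list of names standing in for `MessageType`. The class `PositionalLookupSketch` and method `fieldAt` are hypothetical. Returning `Optional` instead of null and rejecting negative indexes are both choices made for this sketch, not behavior of the Presto code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Stand-in sketch: the schema is a list of field names.
// All names in this class are hypothetical.
class PositionalLookupSketch {

    // Returns the field at the given position, or an empty Optional when the
    // position falls outside the schema (for example, a table column that was
    // added after the file was written).
    static Optional<String> fieldAt(List<String> fields, int index) {
        if (index >= 0 && index < fields.size()) { // size() plays the role of getFieldCount()
            return Optional.of(fields.get(index));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("id", "name");
        System.out.println(fieldAt(fields, 1)); // prints "Optional[name]"
        System.out.println(fieldAt(fields, 2)); // prints "Optional.empty"
    }
}
```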