Java Code Examples for org.apache.arrow.vector.types.pojo.Field#getType()

The following examples show how to use org.apache.arrow.vector.types.pojo.Field#getType(). Each example notes the source file and project it was taken from.
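Before the project examples, here is a minimal, self-contained sketch of the API itself. It is not taken from any of the projects below; the field names and types are illustrative assumptions. getType() returns the field's ArrowType, whose getTypeID() identifies the concrete type and which can also be mapped to a Types.MinorType.

import java.util.Arrays;

import org.apache.arrow.vector.types.Types;
import org.apache.arrow.vector.types.pojo.ArrowType;
import org.apache.arrow.vector.types.pojo.Field;

public class FieldGetTypeDemo {
    public static void main(String[] args) {
        // Hypothetical fields, built only to demonstrate getType().
        Field idField = Field.nullable("id", new ArrowType.Int(32, true));
        Field nameField = Field.nullable("name", new ArrowType.Utf8());

        for (Field field : Arrays.asList(idField, nameField)) {
            ArrowType type = field.getType();
            // getTypeID() identifies the concrete ArrowType subclass;
            // Types.getMinorTypeForArrowType maps it to the MinorType enum.
            System.out.println(field.getName() + " -> " + type.getTypeID()
                    + " / " + Types.getMinorTypeForArrowType(type));
        }
    }
}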
Example 1
Source File: JdbcSplitQueryBuilder.java    From aws-athena-query-federation with Apache License 2.0
private List<String> toConjuncts(List<Field> columns, Constraints constraints, List<TypeAndValue> accumulator, Map<String, String> partitionSplit)
{
    List<String> conjuncts = new ArrayList<>();
    for (Field column : columns) {
        if (partitionSplit.containsKey(column.getName())) {
            continue; // Ignore constraints on partition name as RDBMS does not contain these as columns. Presto will filter these values.
        }
        ArrowType type = column.getType();
        if (constraints.getSummary() != null && !constraints.getSummary().isEmpty()) {
            ValueSet valueSet = constraints.getSummary().get(column.getName());
            if (valueSet != null) {
                conjuncts.add(toPredicate(column.getName(), valueSet, type, accumulator));
            }
        }
    }
    return conjuncts;
}
 
Example 2
Source File: ElasticsearchSchemaUtils.java    From aws-athena-query-federation with Apache License 2.0
/**
 * Checks that two Schema objects are equal using the following criteria:
 * 1) The Schemas must have the same number of fields.
 * 2) The corresponding fields in the two Schema objects must also be the same irrespective of ordering within
 *    the Schema object using the following criteria:
 *    a) The fields' names must match.
 *    b) The fields' Arrow types must match.
 *    c) The fields' children lists (used for complex fields, e.g. LIST and STRUCT) must match irrespective of
 *       field ordering within the lists.
 *    d) The fields' metadata maps must match. Currently that's only applicable for scaled_float data types that
 *       use the field's metadata map to store the scaling factor associated with the data type.
 * @param mapping1 is a mapping to be compared.
 * @param mapping2 is a mapping to be compared.
 * @return true if the lists are equal, false otherwise.
 */
@VisibleForTesting
protected static final boolean mappingsEqual(Schema mapping1, Schema mapping2)
{
    logger.info("mappingsEqual - Enter - Mapping1: {}, Mapping2: {}", mapping1, mapping2);

    // Schemas must have the same number of elements.
    if (mapping1.getFields().size() != mapping2.getFields().size()) {
        logger.warn("Mappings are different sizes - Mapping1: {}, Mapping2: {}",
                mapping1.getFields().size(), mapping2.getFields().size());
        return false;
    }

    // Mappings must have the same fields (irrespective of internal ordering).
    for (Field field1 : mapping1.getFields()) {
        Field field2 = mapping2.findField(field1.getName());
        // Corresponding fields must have the same Arrow types or the Schemas are deemed not equal.
        if (field2 == null || field1.getType() != field2.getType()) {
            logger.warn("Fields' types do not match - Field1: {}, Field2: {}",
                    field1.getType(), field2 == null ? "null" : field2.getType());
            return false;
        }
        logger.info("Field1 Name: {}, Field1 Type: {}, Field1 Metadata: {}",
                field1.getName(), field1.getType(), field1.getMetadata());
        logger.info("Field2 Name: {}, Field2 Type: {}, Field2 Metadata: {}",
                field2.getName(), field2.getType(), field2.getMetadata());
        // The corresponding fields' children and metadata maps must also match or the Schemas are deemed not equal.
        if (!childrenEqual(field1.getChildren(), field2.getChildren()) ||
                !field1.getMetadata().equals(field2.getMetadata())) {
            return false;
        }
    }

    return true;
}
 
Example 3
Source File: ElasticsearchSchemaUtils.java    From aws-athena-query-federation with Apache License 2.0
/**
 * Checks that two lists of Field objects (corresponding to the children lists of two corresponding fields in
 * two different Schema objects) are the same irrespective of ordering within the lists using the following
 * criteria:
 *    1) The lists of Field objects must be the same size.
 *    2) The corresponding fields' names must match.
 *    3) The corresponding fields' Arrow types must match.
 *    4) The corresponding fields' children lists (used for complex fields, e.g. LIST and STRUCT) must match
 *       irrespective of field ordering within the lists.
 *    5) The corresponding fields' metadata maps must match. Currently that's only applicable for scaled_float
 *       data types that use the field's metadata map to store the scaling factor associated with the data type.
 * @param list1 is a list of children fields to be compared.
 * @param list2 is a list of children fields to be compared.
 * @return true if the lists are equal, false otherwise.
 */
private static final boolean childrenEqual(List<Field> list1, List<Field> list2)
{
    logger.info("childrenEqual - Enter - Children1: {}, Children2: {}", list1, list2);

    // Children lists must have the same number of elements.
    if (list1.size() != list2.size()) {
        logger.warn("Children lists are different sizes - List1: {}, List2: {}", list1.size(), list2.size());
        return false;
    }

    Map<String, Field> fields = new LinkedHashMap<>();
    list2.forEach(value -> fields.put(value.getName(), value));

    // lists must have the same Fields (irrespective of internal ordering).
    for (Field field1 : list1) {
        // Corresponding fields must have the same Arrow types or the Schemas are deemed not equal.
        Field field2 = fields.get(field1.getName());
        if (field2 == null || field1.getType() != field2.getType()) {
            logger.warn("Fields' types do not match - Field1: {}, Field2: {}",
                    field1.getType(), field2 == null ? "null" : field2.getType());
            return false;
        }
        logger.info("Field1 Name: {}, Field1 Type: {}, Field1 Metadata: {}",
                field1.getName(), field1.getType(), field1.getMetadata());
        logger.info("Field2 Name: {}, Field2 Type: {}, Field2 Metadata: {}",
                field2.getName(), field2.getType(), field2.getMetadata());
        // The corresponding fields' children and metadata maps must also match or the Schemas are deemed not equal.
        if (!childrenEqual(field1.getChildren(), field2.getChildren()) ||
                !field1.getMetadata().equals(field2.getMetadata())) {
            return false;
        }
    }

    return true;
}
 
Example 4
Source File: ValueConverter.java    From aws-athena-query-federation with Apache License 2.0
/**
 * Allows for coercing types in the event that schema has evolved or there were other data issues.
 * @param field The Apache Arrow field that the value belongs to.
 * @param origVal The original value from Redis (before any conversion or coercion).
 * @return The coerced value.
 */
public static Object convert(Field field, String origVal)
{
    if (origVal == null) {
        return origVal;
    }

    ArrowType arrowType = field.getType();
    Types.MinorType minorType = Types.getMinorTypeForArrowType(arrowType);

    switch (minorType) {
        case VARCHAR:
            return origVal;
        case INT:
        case SMALLINT:
        case TINYINT:
            return Integer.valueOf(origVal);
        case BIGINT:
            return Long.valueOf(origVal);
        case FLOAT8:
            return Double.valueOf(origVal);
        case FLOAT4:
            return Float.valueOf(origVal);
        case BIT:
            return Boolean.valueOf(origVal);
        case VARBINARY:
            try {
                return origVal.getBytes("UTF-8");
            }
            catch (UnsupportedEncodingException ex) {
                throw new RuntimeException(ex);
            }
        default:
            throw new RuntimeException("Unsupported type conversation " + minorType + " field: " + field.getName());
    }
}
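For context, a hedged usage sketch of the converter above (the field name and the Redis value are assumptions, not from the project): a field whose Arrow type is a 32-bit Int maps to MinorType.INT, so the string is coerced to an Integer.

// Hypothetical field and value, for illustration only.
Field ageField = Field.nullable("age", new ArrowType.Int(32, true));
Object coerced = ValueConverter.convert(ageField, "42"); // Integer.valueOf(42)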
 
Example 5
Source File: FlattenPrel.java    From dremio-oss with Apache License 2.0
@Override
public Field visit(ArrowType.List type) {
  if(field.getName().equals(column.getAsUnescapedPath())){
    Field child = field.getChildren().get(0);
    return new Field(field.getName(), child.isNullable(), child.getType(), child.getChildren());
  }
  return field;
}
 
Example 6
Source File: ParquetRowiseReader.java    From dremio-oss with Apache License 2.0
private void verifyDecimalTypesAreSame(OutputMutator output, ParquetColumnResolver columnResolver) {
  for (ValueVector vector : output.getVectors()) {
    Field fieldInSchema = vector.getField();
    if (fieldInSchema.getType().getTypeID() == ArrowType.ArrowTypeID.Decimal) {
      ArrowType.Decimal typeInTable = (ArrowType.Decimal) fieldInSchema.getType();
      Type typeInParquet = null;
      // the field in arrow schema may not be present in hive schema
      try {
        typeInParquet  = schema.getType(columnResolver.getParquetColumnName(fieldInSchema.getName()));
      } catch (InvalidRecordException e) {
      }
      if (typeInParquet == null) {
        continue;
      }
      boolean schemaMisMatch = true;
      OriginalType originalType = typeInParquet.getOriginalType();
      if (originalType.equals(OriginalType.DECIMAL) ) {
        int precision = typeInParquet
          .asPrimitiveType().getDecimalMetadata().getPrecision();
        int scale = typeInParquet.asPrimitiveType().getDecimalMetadata().getScale();
        ArrowType decimalType = new ArrowType.Decimal(precision, scale);
        if (decimalType.equals(typeInTable)) {
          schemaMisMatch = false;
        }
      }
      if (schemaMisMatch) {
        throw UserException.schemaChangeError().message("Mixed types "+ fieldInSchema.getType()
          + " , " + typeInParquet + " is not supported.")
          .build(logger);
      }
    }
  }
}
 
Example 7
Source File: ArrowSchemaUtilTest.java    From iceberg with Apache License 2.0
private void validate(Type iceberg, Field field, boolean optional) {
  ArrowType arrowType = field.getType();
  Assert.assertEquals(optional, field.isNullable());
  switch (iceberg.typeId()) {
    case BOOLEAN:
      Assert.assertEquals(BOOLEAN_FIELD, field.getName());
      Assert.assertEquals(ArrowType.ArrowTypeID.Bool, arrowType.getTypeID());
      break;
    case INTEGER:
      Assert.assertEquals(INTEGER_FIELD, field.getName());
      Assert.assertEquals(ArrowType.ArrowTypeID.Int, arrowType.getTypeID());
      break;
    case LONG:
      Assert.assertEquals(LONG_FIELD, field.getName());
      Assert.assertEquals(ArrowType.ArrowTypeID.Int, arrowType.getTypeID());
      break;
    case FLOAT:
      Assert.assertEquals(FLOAT_FIELD, field.getName());
      Assert.assertEquals(ArrowType.ArrowTypeID.FloatingPoint, arrowType.getTypeID());
      break;
    case DOUBLE:
      Assert.assertEquals(DOUBLE_FIELD, field.getName());
      Assert.assertEquals(ArrowType.ArrowTypeID.FloatingPoint, arrowType.getTypeID());
      break;
    case DATE:
      Assert.assertEquals(DATE_FIELD, field.getName());
      Assert.assertEquals(ArrowType.ArrowTypeID.Date, arrowType.getTypeID());
      break;
    case TIME:
      Assert.assertEquals(TIME_FIELD, field.getName());
      Assert.assertEquals(ArrowType.ArrowTypeID.Time, arrowType.getTypeID());
      break;
    case TIMESTAMP:
      Assert.assertEquals(TIMESTAMP_FIELD, field.getName());
      Assert.assertEquals(ArrowType.ArrowTypeID.Timestamp, arrowType.getTypeID());
      break;
    case STRING:
      Assert.assertEquals(STRING_FIELD, field.getName());
      Assert.assertEquals(ArrowType.ArrowTypeID.Utf8, arrowType.getTypeID());
      break;
    case FIXED:
      Assert.assertEquals(FIXED_WIDTH_BINARY_FIELD, field.getName());
      Assert.assertEquals(ArrowType.Binary.TYPE_TYPE, arrowType.getTypeID());
      break;
    case BINARY:
      Assert.assertEquals(BINARY_FIELD, field.getName());
      Assert.assertEquals(ArrowType.Binary.TYPE_TYPE, arrowType.getTypeID());
      break;
    case DECIMAL:
      Assert.assertEquals(DECIMAL_FIELD, field.getName());
      Assert.assertEquals(ArrowType.Decimal.TYPE_TYPE, arrowType.getTypeID());
      break;
    case STRUCT:
      Assert.assertEquals(STRUCT_FIELD, field.getName());
      Assert.assertEquals(ArrowType.Struct.TYPE_TYPE, arrowType.getTypeID());
      break;
    case LIST:
      Assert.assertEquals(LIST_FIELD, field.getName());
      Assert.assertEquals(ArrowType.List.TYPE_TYPE, arrowType.getTypeID());
      break;
    case MAP:
      Assert.assertEquals(MAP_FIELD, field.getName());
      Assert.assertEquals(ArrowType.ArrowTypeID.Map, arrowType.getTypeID());
      break;
    default:
      throw new UnsupportedOperationException("Check not implemented for type: " + iceberg);
  }
}
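A side note on the assertions above: each concrete ArrowType subclass exposes a TYPE_TYPE constant that is simply its ArrowTypeID value, so the two assertion styles seen in this test are interchangeable. A minimal illustration (the variable is hypothetical):

ArrowType binary = new ArrowType.Binary();
// ArrowType.Binary.TYPE_TYPE is the same constant as ArrowType.ArrowTypeID.Binary.
assert binary.getTypeID() == ArrowType.Binary.TYPE_TYPE;
assert binary.getTypeID() == ArrowType.ArrowTypeID.Binary;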
 
Example 8
Source File: FieldIdUtil2.java    From dremio-oss with Apache License 2.0
private static TypedFieldId getFieldIdIfMatches(
    final Field field,
    final TypedFieldId.Builder builder,
    boolean addToBreadCrumb,
    final PathSegment seg) {
  if (seg == null) {
    if (addToBreadCrumb) {
      builder.intermediateType(CompleteType.fromField(field));
    }
    return builder.finalType(CompleteType.fromField(field)).build();
  }

  final ArrowTypeID typeType = field.getType().getTypeID();

  if (seg.isArray()) {
    if (seg.isLastPath()) {
      CompleteType type;
      if (typeType == ArrowTypeID.Struct) {
        type = CompleteType.fromField(field);
      } else if (typeType == ArrowTypeID.List) {
        type = CompleteType.fromField(field.getChildren().get(0));
        builder.listVector();
      } else {
        throw new UnsupportedOperationException("FieldIdUtil does not support field of type " + field.getType());
      }
      builder //
              .withIndex() //
              .finalType(type);

      // remainder starts with the 1st array segment in BasePath.
      // only set remainder when it's the only array segment.
      if (addToBreadCrumb) {
        addToBreadCrumb = false;
        builder.remainder(seg);
      }
      return builder.build();
    } else {
      if (addToBreadCrumb) {
        addToBreadCrumb = false;
        builder.remainder(seg);
      }
    }
  } else {
    if (typeType == ArrowTypeID.List) {
      return null;
    }
  }

  final Field inner;
  if (typeType == ArrowTypeID.Struct) {
    if(seg.isArray()){
      return null;
    }
    FieldWithOrdinal ford = getChildField(field, seg.isArray() ? null : seg.getNameSegment().getPath());
    if (ford == null) {
      return null;
    }
    inner = ford.field;
    if (addToBreadCrumb) {
      builder.intermediateType(CompleteType.fromField(inner));
      builder.addId(ford.ordinal);
    }
  } else if (typeType == ArrowTypeID.List) {
    inner = field.getChildren().get(0);
  } else {
    throw new UnsupportedOperationException("FieldIdUtil does not support field of type " + field.getType());
  }

  final ArrowTypeID innerTypeType = inner.getType().getTypeID();
  if (innerTypeType == ArrowTypeID.List || innerTypeType == ArrowTypeID.Struct) {
    return getFieldIdIfMatches(inner, builder, addToBreadCrumb, seg.getChild());
  } else if (innerTypeType == ArrowTypeID.Union) {
    return getFieldIdIfMatchesUnion(inner, builder, addToBreadCrumb, seg.getChild());
  } else {
    if (seg.isNamed()) {
      if(addToBreadCrumb) {
        builder.intermediateType(CompleteType.fromField(inner));
      }
      builder.finalType(CompleteType.fromField(inner));
    } else {
      builder.finalType(CompleteType.fromField(inner));
    }

    if (seg.isLastPath()) {
      return builder.build();
    } else {
      PathSegment child = seg.getChild();
      if (child.isLastPath() && child.isArray()) {
        if (addToBreadCrumb) {
          builder.remainder(child);
        }
        builder.finalType(CompleteType.fromField(inner));
        return builder.build();
      } else {
        logger.warn("You tried to request a complex type inside a scalar object or path or type is wrong.");
        return null;
      }
    }
  }
}
 
Example 9
Source File: BasicTypeHelper.java    From dremio-oss with Apache License 2.0
public static FieldVector getNewVector(Field field, BufferAllocator allocator, CallBack callBack) {
  if (field.getType() instanceof ObjectType) {
    return new ObjectVector(field.getName(), allocator);
  }

  MinorType type = org.apache.arrow.vector.types.Types.getMinorTypeForArrowType(field.getType());

  List<Field> children = field.getChildren();

  switch (type) {

  case UNION:
    UnionVector unionVector = new UnionVector(field.getName(), allocator, callBack);
    if (!children.isEmpty()) {
      unionVector.initializeChildrenFromFields(children);
    }
    return unionVector;
  case LIST:
    ListVector listVector = new ListVector(field.getName(), allocator, callBack);
    if (!children.isEmpty()) {
      listVector.initializeChildrenFromFields(children);
    }
    return listVector;
  case STRUCT:
    StructVector structVector = new StructVector(field.getName(), allocator, callBack);
    if (!children.isEmpty()) {
      structVector.initializeChildrenFromFields(children);
    }
    return structVector;

  case NULL:
    return new ZeroVector();
  case TINYINT:
    return new TinyIntVector(field, allocator);
  case UINT1:
    return new UInt1Vector(field, allocator);
  case UINT2:
    return new UInt2Vector(field, allocator);
  case SMALLINT:
    return new SmallIntVector(field, allocator);
  case INT:
    return new IntVector(field, allocator);
  case UINT4:
    return new UInt4Vector(field, allocator);
  case FLOAT4:
    return new Float4Vector(field, allocator);
  case INTERVALYEAR:
    return new IntervalYearVector(field, allocator);
  case TIMEMILLI:
    return new TimeMilliVector(field, allocator);
  case BIGINT:
    return new BigIntVector(field, allocator);
  case UINT8:
    return new UInt8Vector(field, allocator);
  case FLOAT8:
    return new Float8Vector(field, allocator);
  case DATEMILLI:
    return new DateMilliVector(field, allocator);
  case TIMESTAMPMILLI:
    return new TimeStampMilliVector(field, allocator);
  case INTERVALDAY:
    return new IntervalDayVector(field, allocator);
  case DECIMAL:
    return new DecimalVector(field, allocator);
  case FIXEDSIZEBINARY:
    return new FixedSizeBinaryVector(field.getName(), allocator, WIDTH_ESTIMATE);
  case VARBINARY:
    return new VarBinaryVector(field, allocator);
  case VARCHAR:
    return new VarCharVector(field, allocator);
  case BIT:
    return new BitVector(field, allocator);
  default:
    break;
  }
  // All ValueVector types have been handled.
  throw new UnsupportedOperationException(buildErrorMessage("get new vector", type));
}
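A hedged sketch of how a factory like this is typically driven by Field#getType() (the allocator setup and field definition are assumptions, not from dremio-oss): a field carrying a double-precision FloatingPoint type falls into the FLOAT8 case and yields a Float8Vector.

try (BufferAllocator allocator = new RootAllocator(Long.MAX_VALUE)) {
  // Hypothetical field; its ArrowType selects the vector class.
  Field field = Field.nullable("price",
      new ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE));
  try (FieldVector vector = BasicTypeHelper.getNewVector(field, allocator, null)) {
    vector.allocateNew(); // the returned vector is a Float8Vector here
  }
}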
 
Example 10
Source File: CompleteType.java    From dremio-oss with Apache License 2.0
public static CompleteType fromField(Field field){
    // IGNORE this until the NullableMapVector.getField() returns a nullable type.
//    Preconditions.checkArgument(field.isNullable(), "Dremio only supports nullable types.");
    return new CompleteType(field.getType(), field.getChildren());
  }
 
Example 11
Source File: MajorTypeHelper.java    From dremio-oss with Apache License 2.0
public static MajorType getMajorTypeForField(Field field) {
  if (field.getType() instanceof ObjectType) {
    return Types.required(TypeProtos.MinorType.GENERIC_OBJECT);
  }
  return getMajorTypeForArrowType(field.getType(), field.getChildren());
}
 
Example 12
Source File: BatchSchemaField.java    From dremio-oss with Apache License 2.0
public static BatchSchemaField fromField(Field field) {
  List<Field> children = field.getChildren().stream().map(
    BatchSchemaField::fromField).collect(Collectors.toList());

  return new BatchSchemaField(field.getName(), field.isNullable(), field.getType(), children);
}