Serialization Formats
Apache Thrift: Cross-Language Services Framework
★ 4.0
Protocol Buffers (protobuf): Google's Data Interchange Format
★ 4.7
pip install thrift
pip install protobuf
Python data engineers encounter Apache Thrift when working with systems such as Apache Parquet, HBase, and Cassandra. Parquet encodes its file metadata with Thrift, HBase exposes a Thrift gateway, and Cassandra's legacy client API (removed in Cassandra 4.0) was Thrift-based. The `thrift` Python library lets engineers call Thrift-based services from Python pipelines. Thrift is also used in microservice architectures where Python services need to communicate with services written in Java, Go, or C++ through a strongly typed interface.
Python data engineers use `protobuf` (the `google.protobuf` package) to serialize and deserialize structured messages in Kafka topics and gRPC services. Proto schemas define the contract between Python producers and consumers: `protoc` generates Python classes from `.proto` files, and engineers call `.SerializeToString()` and `.ParseFromString()` on those classes to encode and decode messages efficiently.
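The encode/decode round trip can be sketched without running `protoc`, assuming the `protobuf` package is installed. This minimal example uses `Struct`, a well-known message type that ships with the package, as a stand-in for a generated class. The helper names `encode_event` and `decode_event` and the field names are hypothetical. A real pipeline would normally use classes generated from its own `.proto` schema.

```python
from google.protobuf import struct_pb2


def encode_event(fields: dict) -> bytes:
    """Serialize a flat dict into protobuf wire format (hypothetical helper)."""
    msg = struct_pb2.Struct()
    msg.update(fields)  # copy the dict entries into the message
    return msg.SerializeToString()


def decode_event(payload: bytes) -> struct_pb2.Struct:
    """Parse protobuf bytes back into a Struct message (hypothetical helper)."""
    msg = struct_pb2.Struct()
    msg.ParseFromString(payload)
    return msg


# Producer side: these bytes would be the Kafka record value or gRPC payload.
payload = encode_event({"event": "click", "user_id": 42})

# Consumer side: decode and read fields by key.
decoded = decode_event(payload)
print(decoded["event"], decoded["user_id"])
```

`Struct` stores every number as a double, so `user_id` comes back as a float. A schema-specific generated class would keep the declared integer type, which is one reason to prefer `protoc` output for real contracts.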
Individual Tool Pages