XML Parsing from Protobuf in Golang: Marshaling to RFC3339

As Golang developers we are familiar with Protobuf as a way to expose data through gRPC API endpoints, so that the data can be serialized over the wire.

Being familiar with Protobuf steers us towards defining data in proto files. These can be used to transfer the data over the wire, but we can also use them to store the data on disk (marshalled with protojson).

The Scenario

Imagine a customer who relies on an existing system that parses XML data. They lack the resources or time to migrate to Protobuf. To accommodate them, we need to provide data in XML format. But we still want to keep our data definitions in proto files, for ease of maintenance and consistency.

This post provides an example of marshaling a proto message with a google.protobuf.Timestamp field to XML, in a more readable format than the default marshalled timestamp.

Note that the generated Go type for the Timestamp (timestamppb.Timestamp) is defined as

type Timestamp struct {
	// Represents seconds of UTC time since Unix epoch
	// 1970-01-01T00:00:00Z. Must be from 0001-01-01T00:00:00Z to
	// 9999-12-31T23:59:59Z inclusive.
	Seconds int64 `protobuf:"varint,1,opt,name=seconds,proto3" json:"seconds,omitempty"`
	// Non-negative fractions of a second at nanosecond resolution. Negative
	// second values with fractions must still have non-negative nanos values
	// that count forward in time. Must be from 0 to 999,999,999
	// inclusive.
	Nanos int32 `protobuf:"varint,2,opt,name=nanos,proto3" json:"nanos,omitempty"`
	// contains filtered or unexported fields
} 

By default, the XML produced by encoding/xml will look similar to

<timestamp>
  <Seconds>20345698</Seconds>
  <Nanos>213243</Nanos>
</timestamp>
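
The nested shape comes from how encoding/xml treats a struct that has no custom XML methods: each exported field becomes a child element named after the field. Here is a minimal sketch of that behavior. It uses hypothetical stand-in structs (rawTimestamp and record are illustrative names, not the generated protobuf types), so it runs without any protobuf dependency.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// rawTimestamp is a hypothetical stand-in for the exported fields of
// timestamppb.Timestamp. It is not the generated type.
type rawTimestamp struct {
	Seconds int64
	Nanos   int32
}

// record is a hypothetical message with a single timestamp field.
type record struct {
	Timestamp rawTimestamp
}

// defaultXML marshals the record without any custom XML methods.
func defaultXML(r record) (string, error) {
	out, err := xml.MarshalIndent(r, "", "  ")
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	s, err := defaultXML(record{Timestamp: rawTimestamp{Seconds: 20345698, Nanos: 213243}})
	if err != nil {
		panic(err)
	}
	// Prints a <Timestamp> element with nested <Seconds> and <Nanos> children.
	fmt.Println(s)
}
```

The element names follow the Go field names because the structs carry no xml struct tags.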

However, what we would prefer is a more human readable block such as

<timestamp>2023-10-27T10:00:00Z</timestamp>

Apart from the trailing Z, which marks the UTC time zone, we humans can read a plain date (YYYY-MM-DD) and time (HH:MM:SS) there. The format also allows optional fractional seconds after the seconds field, for example 10:00:00.000Z for millisecond precision.
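
To make the relation between the two representations concrete, here is a small sketch that turns a seconds/nanos pair into such a string using only the standard library. The helper name toRFC3339 is an assumption made for this illustration; it is not part of any protobuf package.

```go
package main

import (
	"fmt"
	"time"
)

// toRFC3339 converts a seconds/nanos pair (the two fields of a protobuf
// Timestamp) into an RFC 3339 string in UTC. The time.RFC3339 layout has
// no fractional part, so the nanos are dropped from the output.
func toRFC3339(seconds int64, nanos int32) string {
	return time.Unix(seconds, int64(nanos)).UTC().Format(time.RFC3339)
}

func main() {
	fmt.Println(toRFC3339(1698400800, 0)) // 2023-10-27T10:00:00Z
	fmt.Println(toRFC3339(0, 0))          // 1970-01-01T00:00:00Z
}
```

Calling UTC() before formatting is what yields the Z suffix; without it, Format would use the local time zone offset of the machine.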


Encoding Protobuf Messages to XML: Ensuring RFC3339 Timestamps

Protobuf (Protocol Buffers) is a powerful binary serialization format developed by Google. It’s widely used for efficient data exchange, especially in microservices architectures. However, when you need to integrate Protobuf messages with systems that expect XML, you need a way to serialize the Protobuf data into an XML format.

The Challenge: RFC3339 Timestamps

RFC3339 defines a standard text format for dates and times, based on ISO 8601. It is not specific to XML, but it matters for interoperability: many systems expect timestamps in exactly this form. Without proper encoding, the timestamps in your XML might be misinterpreted or rejected, leading to errors.

Using Protobuf and Go

First let's define the proto file.

syntax = "proto3";

package example;

import "google/protobuf/timestamp.proto";

message Item {
  string name = 1;
  int32 quantity = 2;
  CustomTimeStamp created_at = 3;
}

message CustomTimeStamp {
  google.protobuf.Timestamp timestamp = 1;
}

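As you can see above, we deliberately wrap the timestamp field in our own message, CustomTimeStamp. Go does not allow adding methods to a type from another package such as timestamppb.Timestamp, but it does allow adding them to a generated type in our own package. The XML code below relies on that.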

// File: custom_timestamp_xml.go, placed in the same package as the generated
// example.pb.go. Go only allows methods on types declared in the same package,
// so the XML methods have to live next to the generated CustomTimeStamp type.
package example

import (
	"encoding/xml"
	"fmt"
	"time"

	"google.golang.org/protobuf/types/known/timestamppb"
)

// MarshalXML implements the xml.Marshaler interface for CustomTimeStamp.
func (ct *CustomTimeStamp) MarshalXML(e *xml.Encoder, start xml.StartElement) error {
	if ct.GetTimestamp() == nil {
		// Write an empty element. Encoding only the start token would
		// leave the element unclosed and make the encoder fail.
		return e.EncodeElement("", start)
	}

	// AsTime returns the time in UTC, so the RFC3339 string ends in "Z".
	// time.RFC3339 drops fractional seconds; use time.RFC3339Nano to keep them.
	formattedTime := ct.GetTimestamp().AsTime().Format(time.RFC3339)

	return e.EncodeElement(formattedTime, start)
}

// UnmarshalXML implements the xml.Unmarshaler interface for CustomTimeStamp.
func (ct *CustomTimeStamp) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
	var text string
	if err := d.DecodeElement(&text, &start); err != nil {
		return fmt.Errorf("error unmarshalling CustomTimeStamp: %w", err)
	}

	// An empty element is what MarshalXML writes for an unset timestamp.
	if text == "" {
		ct.Timestamp = nil
		return nil
	}

	parsedTime, err := time.Parse(time.RFC3339, text)
	if err != nil {
		return fmt.Errorf("invalid timestamp format: %w", err)
	}

	ct.Timestamp = timestamppb.New(parsedTime)

	return nil
}

// File: main.go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
	"time"

	"example" // Replace with the import path of your generated package

	"google.golang.org/protobuf/types/known/timestamppb"
)

func main() {
	item := &example.Item{
		Name:     "Widget",
		Quantity: 10,
		CreatedAt: &example.CustomTimeStamp{
			Timestamp: timestamppb.New(time.Now()),
		},
	}

	xmlData, err := xml.MarshalIndent(item, "", "  ")
	if err != nil {
		log.Fatalf("Error marshalling to XML: %v", err)
	}

	fmt.Println(xml.Header + string(xmlData))
}

Explanation

  • We define a new protobuf message CustomTimeStamp that encapsulates the google.protobuf.Timestamp.
  • We implement the MarshalXML and UnmarshalXML methods for our generated example.CustomTimeStamp struct. Because Go only allows methods on types declared in the same package, these methods live in the same package as the generated code. MarshalXML formats the protobuf timestamp as an RFC3339 string using the time.RFC3339 layout constant and writes it into the XML document; UnmarshalXML parses that string back into a timestamp.
  • The rest of the code marshals the Item message with the default behavior.
  • xml.Marshal and xml.Unmarshal automatically detect the custom XML methods and use them when marshalling our protobuf-defined data to XML and unmarshalling it back.
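
The same pattern can be tried without generating any protobuf code. The following is a minimal sketch under stated assumptions: stamp and item are hypothetical stand-ins for the generated CustomTimeStamp and Item types, and roundTrip is a helper invented for this example. Only the standard library is used.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"time"
)

// stamp is a hypothetical stand-in for the generated CustomTimeStamp message.
type stamp struct {
	Seconds int64
	Nanos   int32
}

// MarshalXML writes the stamp as one RFC 3339 string instead of nested fields.
func (s *stamp) MarshalXML(e *xml.Encoder, start xml.StartElement) error {
	text := time.Unix(s.Seconds, int64(s.Nanos)).UTC().Format(time.RFC3339)
	return e.EncodeElement(text, start)
}

// UnmarshalXML parses the RFC 3339 string back into seconds and nanos.
func (s *stamp) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
	var text string
	if err := d.DecodeElement(&text, &start); err != nil {
		return err
	}
	t, err := time.Parse(time.RFC3339, text)
	if err != nil {
		return fmt.Errorf("invalid timestamp %q: %w", text, err)
	}
	s.Seconds, s.Nanos = t.Unix(), int32(t.Nanosecond())
	return nil
}

// item is a hypothetical stand-in for the generated Item message.
type item struct {
	Name      string
	Quantity  int32
	CreatedAt *stamp
}

// roundTrip marshals an item to XML and decodes the result again.
func roundTrip(in item) (string, item, error) {
	data, err := xml.Marshal(in)
	if err != nil {
		return "", item{}, err
	}
	var out item
	err = xml.Unmarshal(data, &out)
	return string(data), out, err
}

func main() {
	xmlText, back, err := roundTrip(item{Name: "Widget", Quantity: 10, CreatedAt: &stamp{Seconds: 1698400800}})
	if err != nil {
		panic(err)
	}
	fmt.Println(xmlText)
	fmt.Println(back.CreatedAt.Seconds) // 1698400800
}
```

Note that the methods use pointer receivers and the field is a pointer, which matches how generated protobuf messages are normally passed around.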

Output:

The code generates XML output with the timestamp in RFC3339 format. The exact output will vary depending on the current time, but it will look something like this:

<?xml version="1.0" encoding="UTF-8"?>
<Item>
  <Name>Widget</Name>
  <Quantity>10</Quantity>
  <CreatedAt>2023-10-27T10:00:00Z</CreatedAt>
</Item>

Conclusion

With this approach, you can integrate Protobuf messages into XML-based systems while ensuring that timestamps are represented according to the RFC3339 standard. This keeps the data readable for humans and interoperable with systems that expect that format.

