Episode 03 - InCommon Federation - Per-Entity Metadata Distribution Service

Jan 30, 2020

Let’s talk about the Per-Entity Metadata Distribution (PEMD) Service. But before that, have you read my earlier posts on the InCommon Federation and on metadata aggregates in InCommon? If not, I recommend reading them first to get a better understanding of the PEMD flow.

In the beginning, there were only a few entities, so the metadata aggregates were small and manageable. The aggregates grew as more and more organizations registered with InCommon. In early 2016, federation operators began importing metadata from international federations as part of InCommon’s participation in eduGAIN, and the InCommon metadata aggregate file grew even further in size.

With the expansion of the InCommon federation, customers encountered several problems when using aggregates.

Metadata consumers experienced slow start-up times and high memory usage.

Apart from that, most consumers do not need the metadata of all entities; they use only a fraction of them, so a lot of memory was spent storing metadata they never used. Furthermore, a single error in one entity descriptor can cause an outage of the service.

These problems led InCommon to think about a new way to distribute entity metadata. InCommon introduced a new metadata service, the “Per-Entity Metadata Distribution Service”, which is designed to solve the problems above.

Per-Entity Metadata Distribution Service (PEMD)

With the per-entity metadata distribution service, a consumer can retrieve the metadata for a specific entity from a metadata query (MDQ) service. The metadata for each entity is available at its own URL, which is uniquely identified by the entityID attribute of the entity descriptor. The service can also be used to prefetch the metadata of an entity, cache it at start-up, and refresh it on a regular basis. With aggregates, InCommon signs the whole aggregate. With this service, InCommon signs the metadata of each entity individually, and it uses a separate signing key for per-entity metadata distribution.

By now you may have a question: how does this work?

The PEMD service relies on a metadata query (MDQ) service. A client requests the metadata of an entity by its entityID, and the MDQ service returns the metadata for that entityID.

In brief, when an entity (SP or IdP) needs the metadata of another entity, it sends a request containing that entityID to the MDQ service, and the MDQ service looks up the metadata for that entityID and sends it back.
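To make the request step concrete, here is a minimal sketch of how a client might turn an entityID into a per-entity URL. It assumes the URL layout described in the MDQ specifications as I understand them: the percent-encoded entityID (or, alternatively, "{sha1}" followed by the hex SHA-1 hash of the entityID) is appended to the service's "/entities/" path. The base URL and the function names are hypothetical and chosen only for illustration.

```python
import hashlib
from urllib.parse import quote

# Hypothetical MDQ base URL, used only for illustration.
MDQ_BASE_URL = "https://mdq.example.org"


def mdq_entity_url(entity_id: str, base_url: str = MDQ_BASE_URL) -> str:
    """Build the per-entity URL for an entityID.

    The entityID is percent-encoded in full (including ':' and '/')
    so that it fits into a single path segment after '/entities/'.
    """
    return f"{base_url.rstrip('/')}/entities/{quote(entity_id, safe='')}"


def mdq_sha1_entity_url(entity_id: str, base_url: str = MDQ_BASE_URL) -> str:
    """Build the alternative URL form that uses a SHA-1 hash of the entityID."""
    digest = hashlib.sha1(entity_id.encode("utf-8")).hexdigest()
    identifier = "{sha1}" + digest
    return f"{base_url.rstrip('/')}/entities/{quote(identifier, safe='')}"


if __name__ == "__main__":
    print(mdq_entity_url("https://idp.example.edu/idp/shibboleth"))
    print(mdq_sha1_entity_url("https://idp.example.edu/idp/shibboleth"))
```

Encoding the whole entityID matters because most entityIDs are themselves URLs; left unencoded, their slashes would be read as extra path segments.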

For communication with the MDQ service, we need to use the metadata query protocol.

Metadata Query protocol

The metadata query protocol is a REST-like API for requesting and receiving arbitrary metadata. The specification of the MDQ protocol is split into two parts:

The base specification

A SAML profile of the base specification, which focuses on SAML metadata

You can refer to those specifications to learn more about the MDQ protocol.
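As a rough illustration of the REST-like style, the sketch below builds (but does not send) an HTTP GET request for one entity and maps a few response status codes to client actions. The media type "application/samlmetadata+xml" is the registered type for SAML metadata. The use of a conditional request with an ETag and the handling of 304 and 404 reflect my reading of the specifications and should be checked against them. The base URL, function names, and returned labels are hypothetical.

```python
from typing import Optional
from urllib.parse import quote
from urllib.request import Request

# Registered media type for SAML metadata.
SAML_METADATA_TYPE = "application/samlmetadata+xml"


def build_mdq_request(base_url: str, entity_id: str,
                      etag: Optional[str] = None) -> Request:
    """Build a GET request for one entity's metadata (nothing is sent here)."""
    url = f"{base_url.rstrip('/')}/entities/{quote(entity_id, safe='')}"
    headers = {"Accept": SAML_METADATA_TYPE}
    if etag:
        # Conditional request: ask for the document only if it has changed.
        headers["If-None-Match"] = etag
    return Request(url, headers=headers, method="GET")


def interpret_status(status: int) -> str:
    """Map an HTTP status code to a client action (labels are illustrative)."""
    if status == 200:
        return "use-new-metadata"
    if status == 304:
        return "keep-cached-metadata"
    if status == 404:
        return "unknown-entity"
    return "error"


if __name__ == "__main__":
    req = build_mdq_request("https://mdq.example.org",
                            "https://sp.example.org/shibboleth")
    print(req.get_method(), req.full_url)
```

A real client would pass the request to `urllib.request.urlopen` with a timeout and then verify the signature on the returned entity descriptor before trusting it; both steps are left out of this sketch.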

The per-entity metadata distribution service using the MDQ protocol has the following benefits:

Reduced memory and resource consumption by a metadata consumer since it need only request and consume metadata for entities with which it needs to federate.
Reduced load on the metadata distribution service since consumers only query for and download the actual entity descriptors they need.
Decoupling of entity descriptors so that errors for any single entity need not impede the distribution and consumption of the metadata for all other entities.
Reduced network traffic, and relief from the brittleness of large aggregates.

Risks associated with PEMD service

Because the whole process depends on the MDQ service, several risks come with it:

Unavailability of the metadata query service
Poor responsiveness
Network failures
High latency

Overcoming the risks associated with the PEMD service

The risks mentioned above can be mitigated by implementing a few capabilities on the metadata client side (SPs and IdPs).

A persistent caching mechanism that retains previously-retrieved metadata across software restarts so that it may be re-used if the software is restarted when the MDQ service is not available. A likely mechanism is caching to local disk and then consumption from the cache on restart.
A mechanism for pre-loading metadata for high-value IdPs and SPs and keeping it available. This enables successful operation the first time a high-value entity’s metadata is needed, even if the MDQ service is not available.
The ability to detect a failed query, retry appropriately, and, after repeated queries that complete but fail, fail over to a secondary MDQ service. A complete implementation would include the ability to mark an MDQ service as unavailable for some time, test it again later, and return to using it once it is available again and completing queries successfully.
Likewise, the ability to detect unresponsive (hanging) MDQ services or MDQ services that do not answer queries fast enough, and similarly retry, mark them as unavailable, and later test them before restoring them into service.
Clients without the above capabilities can still use the per-entity metadata distribution infrastructure and interoperate with MDQ services but they risk lower availability for their users.
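The persistent caching and pre-loading capabilities listed above could look roughly like the following sketch. It is a simplified illustration, not how any particular SP or IdP software implements it: the class name, the file layout, and the injected `fetch` callable are all hypothetical, and signature validation and expiry checks are left out. The cache always tries the MDQ service first and falls back to the copy on disk only when the fetch fails.

```python
import hashlib
from pathlib import Path
from typing import Callable, Iterable, Optional


class PersistentMetadataCache:
    """Keep fetched metadata on disk so it survives software restarts."""

    def __init__(self, cache_dir: str, fetch: Callable[[str], bytes]):
        # fetch(entity_id) is assumed to return the metadata bytes and to
        # raise OSError (e.g. urllib.error.URLError) when the service fails.
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.fetch = fetch

    def _path(self, entity_id: str) -> Path:
        # Hash the entityID to get a file name that is safe on any file system.
        name = hashlib.sha1(entity_id.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{name}.xml"

    def get(self, entity_id: str) -> Optional[bytes]:
        path = self._path(entity_id)
        try:
            metadata = self.fetch(entity_id)
        except OSError:
            # MDQ service not reachable: reuse the copy on disk, if any.
            return path.read_bytes() if path.exists() else None
        path.write_bytes(metadata)
        return metadata

    def preload(self, entity_ids: Iterable[str]) -> None:
        """Fetch and store metadata for high-value entities ahead of time."""
        for entity_id in entity_ids:
            self.get(entity_id)
```

Calling `preload` at start-up for a short list of high-value entityIDs means their metadata is already on disk the first time it is needed, even if the MDQ service happens to be down at that moment.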

To increase the availability of the metadata service, InCommon uses two metadata distribution servers, a primary and a secondary. If one distribution server fails, the other can be used to request metadata. Sites that are extremely critical or that do not have reliable internet connectivity can implement a local metadata cache.
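On the client side, using a primary and a secondary server could be sketched as below. This is an illustrative sketch only: the class, its parameters, and the injected `fetch` callable are hypothetical, and the retry count and cool-down period are arbitrary example values. A server that fails repeatedly is marked as unavailable for a while and is tried again once the cool-down has passed.

```python
import time
from typing import Callable, Dict, Sequence


class FailoverMdqClient:
    """Try MDQ servers in order, skipping ones recently marked unavailable."""

    def __init__(self, servers: Sequence[str],
                 fetch: Callable[[str, str], bytes],
                 retries: int = 2, cooldown: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        # fetch(server, entity_id) is assumed to return metadata bytes and to
        # raise OSError on failures and timeouts (as urllib does).
        self.servers = list(servers)
        self.fetch = fetch
        self.retries = retries
        self.cooldown = cooldown
        self.clock = clock
        self._down_until: Dict[str, float] = {}

    def _available(self, server: str) -> bool:
        return self.clock() >= self._down_until.get(server, float("-inf"))

    def get(self, entity_id: str) -> bytes:
        for server in self.servers:
            if not self._available(server):
                continue  # still cooling down, try the next server
            for _ in range(self.retries):
                try:
                    return self.fetch(server, entity_id)
                except OSError:
                    continue  # retry the same server
            # All retries failed: mark the server unavailable for a while.
            self._down_until[server] = self.clock() + self.cooldown
        raise OSError("no MDQ server could answer the query")
```

Passing a timeout to the underlying HTTP call inside `fetch` lets the same logic cover hanging or slow servers, since a timeout also surfaces as an `OSError`.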

[Figure: The flow of an IdP requesting metadata from the MDQ service]

[Figure: The flow of an SP requesting metadata from the MDQ service]

Migration Process

InCommon has decided to move to the PEMD service and is working to migrate all users from the legacy metadata service to the per-entity metadata service.

For now, InCommon still provides the old metadata endpoints so that clients can download the aggregates. Eventually, the new PEMD service will replace the old metadata service, and the old service will no longer be updated. Users therefore need to move to the per-entity metadata service by the deadline.

References

Metadata Service (spaces.at.internet2.edu)

Metadata Distribution Service Documentation (spaces.at.internet2.edu)