# Contents of /trunk/projects/grid/gms/doc/GMS.tex

Revision 5735
Sat Feb 22 01:08:11 2020 UTC (5 months, 1 week ago) by major.brian
CDP reasoning in GMS

\documentclass[11pt,a4paper]{ivoa}
\input tthdefs

\title{Group Membership Service}

\ivoagroup{Grid and Web Services}

\author{Brian Major}
\author{Patrick Dowler}
\author{Giuliano Taffoni}
\author{Adrian Damian}
\author{Sara Bertocco}
\author{Marco Molinaro}

\editor{Brian Major}

% \previousversion[????URL????]{????Funny Label????}
\previousversion{This is the first public release}

\begin{document}
\begin{abstract}

The Group Membership Service (GMS) specification describes a service interface for determining whether a user is a member of a group. Membership information can be used to protect access to proprietary resources. When an authorization decision is needed (whether to grant or deny access to a proprietary resource), a call to GMS can be made to see if the requesting user is a member of the group assigned to protect the resource in question. Examples of proprietary resources are wide ranging but include observation data and metadata, and scarce or limited services and infrastructure. Because this specification details how a single group can protect multiple, potentially distributed, resources, it allows for the representation of teams with common authorization rights. The members of such teams can span multiple organizations but can be managed within a single service. In this way, GMS offers an interoperable, flexible, and scalable mechanism for sharing proprietary assets with a potentially dynamic set of team members.

\end{abstract}

\section*{Acknowledgments}
For the creation of this document we acknowledge the support of the Canadian Space Agency, the National Research Council Canada, the Italian National Institute of Astrophysics, and the Astronomy ESFRI and Research Infrastructure Cluster (ASTERICS). ASTERICS is a project supported by the European Commission Framework Programme Horizon 2020 Research and Innovation action under grant agreement n. 653477.

\section*{Conformance-related definitions}

The words ``MUST'', ``SHALL'', ``SHOULD'', ``MAY'', ``RECOMMENDED'', and
``OPTIONAL'' (in upper or lower case) used in this document are to be
interpreted as described in IETF standard RFC2119 \citep{std:RFC2119}.

The \emph{Virtual Observatory (VO)} is a
general term for a collection of federated resources that can be used
to conduct astronomical research, education, and outreach.
The \href{http://www.ivoa.net}{International
Virtual Observatory Alliance (IVOA)} is a global
collaboration of separately funded projects to develop standards and
infrastructure that enable VO applications.


\section{Introduction}

Through standard IVOA protocols, many astronomy data centres and institutes offer users access to datasets through Datalink (Datalink \citep{2015ivoa.spec.0617D}) and SODA (Server-side Operations for Data Access \citep{2017ivoa.spec.0517B}, etc), metadata and catalogs through TAP (\citep{2010ivoa.spec.0327D}), and storage through VOSpace (\citep{2018ivoa.spec.0621G}). There are also many instances of custom access services to astronomy resources. In certain cases the information within these services is proprietary -- it is only allowed to be accessed by certain individuals. Due to the wide variety and inherently institute-specific set of rules that may define how the information is proprietary, it is beneficial to the owners and maintainers of the rules to have a standard way of describing who has access to what resources. Additionally, the rules describing resource access may be determined by an entity external to the holder of these resources. To these ends, this document sets out a standard, programmatic, and interoperable method of determining whether a given user is allowed to access a given proprietary resource.

The ideas presented by GMS enable data centres to do authorization checks in an interoperable fashion.
In the context of authorization, interoperability can be viewed on two levels: interoperability amongst the cooperating services \emph{within} a data centre, and interoperability \emph{between} data centres. Because of the orthogonal nature of authorization, these amount to the same problem.

Interoperability aside, GMS describes a simple, general, maintainable, and scalable approach to performing authorization, and so is a recommended architectural pattern for managing access to proprietary resources.

\subsection{Proprietary resources}
\label{subsec:propresources}

Most facilities have a period of time in which only the Principal Investigator and the associated team have access to observation metadata and data files. Even without a proprietary period, time is required to verify and validate observations before they can be made public.

Programs create higher-level products such as catalogs and images. On many occasions, they must be accessible to only those who are authorized.

Many organizations have internal, custom services with private or sensitive data. Similarly, organizations are using TAP to access relational data that serves only internal or operational purposes and should not be exposed to the general public.

In these ways and others, proprietary resources exist. For these to be made available to those with authorization, we require a consistent way of performing authorization checks.

\subsection{Role within the VO Architecture}
\label{subsec:roleinarch}

\begin{figure}
\centering

\includegraphics[width=0.9\textwidth]{role_diagram.pdf}
\caption{GMS in relation to the IVOA architecture}
\label{fig:archdiag}
\end{figure}

Fig.~\ref{fig:archdiag} shows the role this document plays within the
IVOA architecture \citep{note:VOARCH}.

GMS can be used by any software that needs to check, for authorization purposes, whether a user is a member of a group.
Because of this general-purpose functionality, GMS slices through all IVOA standards and lies in the middle of the SHARING technical resource in the IVOA architecture diagram. Indeed, GMS allows for the sharing of proprietary resources with a limited audience.

\subsection{Use Cases}
\label{subsec:usecases}

Aside from the main use case of restricting access to proprietary resources, GMS supports a number of other use cases, of both the user and system variety.

\paragraph{Proprietary information} Restricting access to proprietary resources to certain users.

\paragraph{Homogeneity} Using the same mechanism to control access to proprietary resources in a data centre or in multiple data centres.

\paragraph{Scalability} A distributed mechanism that scales linearly with the resources being protected.

\paragraph{Remotely managing access} A project may wish to control access to resources that reside externally.

\paragraph{Access rule sharing} A project may consist of a variety of resources that can be all managed by the same access rule: group membership certification.

\paragraph{Extending the services of a data centre} A project that has hosted data and metadata at a data centre may wish to create value-added services outside of the data centre itself. If some of the data or metadata is proprietary, the extended services may need to determine if a user is allowed to perform certain actions on that data or metadata. A remote GMS instance can be used in this way.

\paragraph{Cooperating institutes} Two or more institutes may work together on a single project that involves proprietary resources and would benefit from a single, standard mechanism for protecting those resources.

\subsection{Definitions}

\paragraph{Authentication} User identification through credentials or an identity provider. Refer to the IVOA Single-Sign-On Profile: Authentication Mechanisms.
\citep{2017ivoa.spec.0524T}

\paragraph{Authorization} Making the decision of whether to grant a user permission to a given resource. The decision usually involves knowing the user's identity obtained through authentication.

\paragraph{GMS Authorization} Making the decision of whether to grant a user permission to a given resource by ensuring the authenticated user is a member of the group assigned to protect that resource.

\paragraph{Resource} Something that may require authorization for access. For example, a service, a data file, metadata, a catalog, or infrastructure.

\paragraph{User} An individual or entity identified by authentication that is attempting to perform some action.

\paragraph{Group} A set of users.

\paragraph{Grant} Authorizing access to a proprietary resource by assigning a group to protect that resource.

\paragraph{Revoke} Removing access to a proprietary resource by removing an assigned group.

\paragraph{Owner} A user or group of users who may grant or revoke access to a specific resource.

\section{Authorization Requirements}

When looking at a system that has proprietary resources that need to be protected, it is clear that there are two distinct phases to authorization: the assignment of the rules protecting the resources, and the attempts by various users to gain access to those resources. They are described here:

\begin{enumerate}
\item The owner(s) of a resource may, at any time, change the rules by which a resource may be accessed. This is the \emph{granting and revoking of access}.
\item When users try to access resources, the granting rules for that resource are evaluated at runtime. This is the \emph{authorization check}.
\end{enumerate}

With these phases in mind and with the use cases defined, we can state that the goals of authorization are:

\begin{itemize}
\item To allow for restricted access to certain resources: only a certain set of individuals may access certain resources.
\item To allow certain individuals to set the access rules on resources. The owner(s) of the resources need to manage the access rules.
\item To be able to re-use granting rules between resources. Projects must authorize access to a variety of proprietary resources.
\item To be able to manage granting rules at a single location. Projects should not have to update each resource on a change to a re-used grant.
\item To be able to reference remote granting rules. Proprietary resources should not be confined to a single institution.
\end{itemize}

\section{Groups}

\subsection{Why Groups?}
\label{subsec:whygroups}

Why are groups a good model for authorization? When a system needs to perform an \emph{authorization check} on a resource, it is trying to determine if the authenticated user is allowed access. There are a number of options on how this can be accomplished.

A simple approach would be to add the identity of the user to the resource. However, this is too restrictive as there may be multiple users who are allowed access. So, we could instead add a list of user identities to the resource being protected. This becomes a problem when two resources need to be protected by the same set of individuals: a change in access rules (\emph{granting and revoking access}) would mean a change to multiple resources, which is difficult to maintain.

So, it becomes clear that this list of users needs to be decoupled from the resource so that it can be referenced and shared by multiple resources. To do so, the list must become a single entity that can be referenced by a name. And so, we must have a named group of users.

A central repository of groups of users would introduce other problems: a single point of failure, and the inability to partition groups of users. Thus, the \emph{location} of the group must accompany the group reference so that it is possible to have multiple collections of groups of users and multiple associated GMS services.

Resources must then reference a group by a URI with a location and a name that is unique within that location. This is called the Group Identifier.

Systems must use the information in the group identifier to determine if the user is a member of the group. Because the location may be outside of the immediate vicinity of the resource, this query must be performed in a standard and accessible manner and so is defined as a RESTful interface to group membership.

\subsection{Group Identifiers}
\label{subsec:groupids}

A \emph{group identifier} is used to uniquely and universally identify an individual group. Group identifiers are attached to proprietary resources for the purpose of referencing the group (or groups) whose members are authorized to access that resource. When a system needs to do an authorization check because a request for access is being made, it can make the decision based on the response of a membership call to a GMS service. With the help of an IVOA Registry, the system has all the information it needs within the group identifier to locate the associated GMS service and formulate the REST call to that service for the membership check.

Group identifiers are IVOA Identifiers (IVOIDs) \citep{2016ivoa.spec.0523D}. This means they can be used to look up the underlying GMS service in an IVOA registry (as is explained in the IVOA Identifiers document). Group names are specified in the \emph{query} part of the IVOID and are mandatory in group identifiers. So, group identifiers must conform to all the rules of IVOIDs and also MUST include the \emph{query} part of an IVOID for the group name.
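
As an illustration only, the split of a group identifier into its registry part and its group name might be sketched as follows. The helper name and its error messages are hypothetical and are not defined by this specification; the sketch assumes the group name is everything after the first question mark.

```python
def split_group_identifier(group_id):
    """Split a group identifier into (registry part, group name).

    Hypothetical helper for illustration; not part of the GMS specification.
    """
    # The group name lives in the query part, after the first question mark.
    registry_part, separator, group_name = group_id.partition("?")
    if not registry_part.lower().startswith("ivo://"):
        raise ValueError("a group identifier must be an IVOID (ivo:// URI)")
    if not separator or not group_name:
        raise ValueError("a group identifier must name the group in its query part")
    return registry_part, group_name


registry_part, group_name = split_group_identifier(
    "ivo://authority.example.com/groupService?mygroup"
)
print(registry_part)  # value to look up in the registry
print(group_name)     # group name to use in the membership call
```

The registry part is what a client would use to locate the GMS service, and the group name is what it would pass to that service.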

Below is an example of a valid and typical group identifier:

\begin{verbatim}
ivo://authority.example.com/groupService?mygroup
\end{verbatim}

There are two ways to resolve the associated GMS service URL: look up the document associated with \emph{ivo://authority.example.com/groupService} in the registry; or, issue a RegTAP (\citep{2014ivoa.spec.1208D}) query for relevant elements of that document. Here we explain the RegTAP approach. Following the recommendations in that specification, the query would be done with three necessary constraints in the where clause:
\begin{itemize}
\item{ivoid} - The \emph{registry part} of the group identifier.
\item{standard\_id} - The desired search feature of GMS.
\item{intf\_role} - Always '\emph{std}', to indicate a standard service is being queried for.
\end{itemize}

and one optional constraint, \emph{security\_method\_id}, used to identify how clients can authenticate to the GMS service. To find a GMS service that does not require authentication, the value of the security\_method\_id constraint would be NULL. However, since anonymously accessible GMS services are not likely to exist (see section \ref{subsec:infopriv}), the query should either:

\begin{itemize}
\item include a desired security\_method\_id in the where clause, as specified by the IVOA Single-Sign-On Profile \citep{2017ivoa.spec.0524T}, or;
\item omit the constraint and iterate over the resulting rows to choose an appropriate security method.
\end{itemize}

The following query will return a row for each access\_url and security\_method\_id combination. The ivoid value is calculated by removing the query string from the group identifier. Since we are looking to perform an \emph{is member} call, we ask for the GMS search capability, identified by the GMS search standardID (see section \ref{subsec:api}).

\begin{verbatim}
SELECT access_url, security_method_id
FROM rr.interface
NATURAL JOIN rr.capability
NATURAL JOIN rr.resource
WHERE
  ivoid = 'ivo://authority.example.com/groupService'
  AND standard_id = 'ivo://ivoa.net/std/gms#search-1.0'
  AND intf_role = 'std'
\end{verbatim}

Note to authors: The use of security\_method\_id is undergoing changes in RegTAP 1.1 (possible removal). This document should be updated accordingly upon its acceptance.

This would result in one or more access URLs capable of supporting a GMS search on the group 'mygroup', each with its corresponding security method support. For example, it could return three rows with values:

\vspace{3mm}
\hskip-1.0cm
\begin{tabular}{l l}
\textbf{access\_url} & \textbf{security\_method\_id} \\
\hline
https://server.example.com/gms1/search & ivo://ivoa.net/sso\#tls-with-password \\
https://server.example.com/gms2/search & ivo://ivoa.net/sso\#tls-with-certificate \\
https://server.example.com/gms2/search & ivo://ivoa.net/sso\#cookie \\
\hline
\end{tabular}
\vspace{3mm}

The first row identifies a URL to the GMS search capability supporting username and password authentication. The second and third rows show a URL that supports both client certificate authentication and cookie authentication. This also implies that membership information about the group 'mygroup' is available from either access URL.

To then perform the group membership query on any of these URLs, the service would formulate a REST call as defined by the GMS Search API.

\section{GMS Search API}

\subsection{API Definition}
\label{subsec:api}

The Group Membership Service defines a RESTful API \citep{fielding00} that allows for the determination of whether a user is a member of a group.
This is the GMS search capability and is identified by the following standard ID:

\begin{verbatim}
ivo://ivoa.net/std/gms#search-1.0
\end{verbatim}

Within this capability, there are two functions and associated endpoints as described in the table below.

\vspace{3mm}
\begin{tabular}{l l}
\textbf{Function} & \textbf{Endpoint} \\
\hline
boolean isMember(Group, User) & GET /search/\{group\} \\
list getMemberships(User) & GET /search \\
\hline
\end{tabular}
\vspace{3mm}

Where \emph{search} represents the \xmlel{access\_url} from the RegTAP call and \emph{\{group\}} is the groupName part of a Group Identifier.

For a successful HTTP GET to \xmlel{/search/\{group\}}, the service shall respond with HTTP 200 (OK). If the user is a member of \xmlel{\{group\}}, the service must repeat the name of the group (ending with a newline as a CRLF\footnote{Carriage Return character (ASCII 13) plus a Line Feed character (ASCII 10)}) in the response body in text/plain format. If the user is not a member of the group, or if the user is not recognized, or if the group is not recognized, the service must return an empty response body.

A successful HTTP GET to \xmlel{/search} shall return HTTP 200 (OK) with a list of the groupNames in which the user is a member in the response body. The response must again have a Content-Type of \emph{text/plain}. Each group (even the last) must end with a CRLF. If the user is not a member of any groups, or if the user is not recognized, the response body must be empty.

It is the authenticated user (the user making the REST call) who is the subject of the membership question. This user's identity is determined by one of the authentication mechanisms described in the IVOA Single-Sign-On Profile. If the user cannot be identified from the call because they have not authenticated, the service must respond with HTTP 400 (Bad Request).
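
As a rough sketch of how a client might interpret the text/plain bodies returned with HTTP 200 by the two search endpoints, consider the following. The function names are hypothetical, the HTTP request itself is left out, and the sketch tolerates a bare line feed even though the specification requires CRLF.

```python
def parse_memberships(body):
    """Return the group names listed in a text/plain search response body.

    Hypothetical helper: each group name occupies one line; an empty body
    means the user has no memberships (or is not recognized).
    """
    return [line for line in body.splitlines() if line]


def is_member(group_name, body):
    """Interpret the body of a membership check for group_name.

    The service repeats the group name when the user is a member and
    returns an empty body otherwise.
    """
    return group_name in parse_memberships(body)


print(is_member("my-collaboration", "my-collaboration\r\n"))  # True
print(is_member("my-collaboration", ""))                      # False
print(parse_memberships("group-a\r\ngroup-b\r\n"))            # ['group-a', 'group-b']
```

Treating the single-group response as a one-element list lets the same parsing serve both endpoints.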

For an unsuccessful HTTP GET to \xmlel{/search/\{group\}} or \xmlel{/search}, the service must respond with the appropriate HTTP response code. Some non-200 response codes and the reason for their response are:

\begin{itemize}
\item{400} - If an authenticated user could not be identified from the HTTP request. The response message should indicate that the user is unknown to the system.
\item{403} - If the calling user has not authenticated. The response message should indicate that the user must authenticate to use the service.
\item{500} - A service operation failure.
\end{itemize}

\subsection {Search Examples}
\label{subsec:examples}

\paragraph{Example 1 - Group access to a VOSpace Node}

A user is trying to download a VOSpace file that has the group-read property set to

\begin{verbatim}
ivo://authority.example.com/gms/instance1?my-collaboration
\end{verbatim}

This resolves (through a RegTAP query for the search API) to URL

\begin{verbatim}
https://server.example.com/gmsService/search
\end{verbatim}

To authorize the user, the VOSpace service queries the GMS search service using the user's delegated credentials

\begin{verbatim}
HTTP GET to https://server.example.com/gmsService/search/
my-collaboration
\end{verbatim}

The GMS service identifies the user, consults its group membership information, and returns a response code of 200 with the string \emph{my-collaboration} (followed by CRLF) written to the response body when confirming the user is a member of group 'my-collaboration'.

\begin{verbatim}
my-collaboration
\end{verbatim}

\paragraph{Example 2 - Group access to table data}

A user issues an ADQL query to a table with row-level authorization in a TAP service. A read-group column defines which group is allowed to read that row.
The first row that is encountered with a non-null read-group has value:

\begin{verbatim}
ivo://authority.example.com/gms/instance1?my-other-collaboration
\end{verbatim}

In anticipation of more rows to follow, and to avoid needing to make multiple calls to GMS, the TAP service asks for all the user's groups when the first protected row is encountered. This cached group information can be applied to all subsequent rows processed. In this example, as in Example 1, the TAP service makes the search call with the user's delegated credentials, because the user making the call is always the subject of the membership check.

\begin{verbatim}
HTTP GET to
https://server.example.com/gmsService/search
\end{verbatim}

The GMS service returns HTTP 200 and all the groups in which the user is a member:

\begin{verbatim}
my-collaboration
my-other-collaboration
my-final-collaboration
\end{verbatim}

The TAP service caches this group membership information for the lifetime of the request so that it can be used if necessary when checking other rows. If a read-group entry with a different \emph{registry part} of the group identifier is encountered, the TAP service must call that GMS service too and add the list of groups to its cache.

\subsection {GMS and Credential Delegation}
\label{subsec:creddel}

User and group membership information may be considered private, so determining who is allowed to make GMS search calls is an important consideration. This is part of the reason why the specification only allows for group membership checks to be made by the user whose membership is being checked (the 'target user'). This rule ensures that only the target user can see their group membership information.

This rule also means that the caller of GMS must use the credentials (proxy certificate, token, etc.) of the target user.
Although users may themselves call GMS for membership information, it is generally not very useful in their hands. The target use case is for programmatic systems to call GMS for authorization checks. So, those systems must have access to the target user's credentials. This is accomplished through use of the IVOA Credential Delegation Protocol \citep{2010ivoa.spec.0218P}.

An alternative to this approach, which was thoroughly considered, is to also allow the use of privileged credentials to make GMS calls. That is: allow the set of systems that need to do authorization checks to make GMS calls for any target user. There are a number of problems with this.

New systems that need to do authorization checks would themselves require authorization to make those calls, and so would have to be added to a set of rules. This is a maintenance step that can be avoided. When there is a chain of privileged service calls that need to be made, the complexity of mapping and maintaining those rules increases quickly. This complexity is compounded when needing to interoperate with external GMS instances.

Another problem with making external GMS calls with privileged accounts is the need for a trust arrangement between the hosts. Organizations would need to ask each other to allow GMS calls to work for certain privileged accounts.

When service calls are always made by the originating user, then services such as GMS only have to concern themselves with the caller of the service, so the complexity and potential for error are much reduced. Information privacy is easily controlled when only the user may see membership information.

\section {Implementation}

\subsection {Implementation Options}
\label{subsec:implopts}

An implementation of GMS requires a system that associates users with zero or more groups. Some options include:

\begin{itemize}
\item Via Grouper (groups in MySQL, users in LDAP) \footnote{https://www.internet2.edu/products-services/trust-identity/grouper/}
\item By using LDAP only with group membership plugins.
\item Through a relational database.
\item A VOSpace implementation: It is conceivable that VOSpace could be used to implement GMS, where ContainerNodes represent groups and DataNodes represent users.
\end{itemize}

\subsection{Groups of Groups}
\label{subsec:groupsofgroups}

It may be functionally attractive to support groups within groups. If this is implemented, then the service must ensure that this representation is reflected by the service API. For example, if an isMember(g) call is made, and the group 'g' is a group within another group in which the user is a member, then the service must return true. The fact that the service supports groups within groups is not exposed through the search API, but the API does not prohibit such an implementation.

If one of the contained groups exists at another GMS instance, perhaps outside of the organization, then the service may transitively query that service to determine group membership, taking care to avoid a loop caused by groups being members of each other.

\appendix

\section{Changes from Previous Versions}
\label{sec:changehistory}

\subsection{Changes from WD-GMS-1.0-20190506}
\begin{itemize}
\item{General text changes for clarification in abstract and sections 1, 3}
\item{Removed support for identifying the 'target user' of a GMS call with id parameters.
The 'target user' is now always the user making the API call to GMS.}
\item{Added new sub-section: GMS and Credential Delegation}
\end{itemize}

\subsection{Changes from WD-GMS-1.0-20190329}
\begin{itemize}
\item{Reverted Group Identifier to be an IVOID}
\item{Corrected, expanded, and clarified the group identifier registry resolution procedure}
\item{Updated bibliography references}
\end{itemize}

\subsection{Changes from WD-GMS-1.0-20181025}
\begin{itemize}
\item{Changed Group identifier URI to be in the format gms://authority/path?group}
\item{Changed names of params user and principal to identity and identityType}
\item{Corrected API definition to always return 200 on success}
\item{REST API now described in a table}
\end{itemize}

% these would be subsections "Changes from v. WD-..."
% Use itemize environments.

\bibliography{ivoatex/ivoabib,ivoatex/docrepo}

\end{document}