# Contents of /trunk/projects/grid/gms/doc/GMS.tex

Revision 5731
Tue Feb 18 00:46:18 2020 UTC by major.brian
Rephrasing of GMS WD abstract

\documentclass[11pt,a4paper]{ivoa}
\input tthdefs

\title{Group Membership Service}

\ivoagroup{Grid and Web Services}

\author{Brian Major}
\author{Patrick Dowler}
\author{Giuliano Taffoni}
\author{Adrian Damian}
\author{Sara Bertocco}
\author{Marco Molinaro}

\editor{Brian Major}

% \previousversion[????URL????]{????Funny Label????}
\previousversion{This is the first public release}

\begin{document}
\begin{abstract}

The Group Membership Service (GMS) specification describes a service interface for determining whether a user is a member of a group. Membership information can be used to protect access to proprietary resources. When an authorization decision is needed (whether to grant or deny access to a proprietary resource), a call to GMS can be made to see if the requesting user is a member of the group assigned to protect the resource in question. Examples of proprietary resources are wide ranging and include observation data and metadata, and scarce or limited services and infrastructure. Because this specification details how a single group can protect multiple, potentially distributed, resources, it allows for the representation of teams with common authorization rights. The members of such teams can span multiple organizations but can be managed within a single service. In this way, GMS offers an interoperable, flexible, and scalable mechanism for sharing proprietary assets with a potentially dynamic set of team members.

\end{abstract}

\section*{Acknowledgments}
For the creation of this document we acknowledge the support of the Canadian Space Agency, the National Research Council Canada, the Italian National Institute of Astrophysics, and the Astronomy ESFRI and Research Infrastructure Cluster (ASTERICS). ASTERICS is a project supported by the European Commission Framework Programme Horizon 2020 Research and Innovation action under grant agreement n. 653477.
\section*{Conformance-related definitions}

The words ``MUST'', ``SHALL'', ``SHOULD'', ``MAY'', ``RECOMMENDED'', and
``OPTIONAL'' (in upper or lower case) used in this document are to be
interpreted as described in IETF standard RFC2119 \citep{std:RFC2119}.

The \emph{Virtual Observatory (VO)} is a
general term for a collection of federated resources that can be used
to conduct astronomical research, education, and outreach.
The \href{http://www.ivoa.net}{International
Virtual Observatory Alliance (IVOA)} is a global
collaboration of separately funded projects to develop standards and
infrastructure that enable VO applications.


\section{Introduction}

Through standard IVOA protocols, many astronomy data centres and institutes offer users access to datasets (Datalink \citep{2015ivoa.spec.0617D}, Server-side Operations for Data Access \citep{2017ivoa.spec.0517B}, etc.), metadata (TAP \citep{2010ivoa.spec.0327D}), and storage (VOSpace \citep{2018ivoa.spec.0621G}). In some cases this information is proprietary: only certain individuals are allowed to access it. Because the rules that define how information is proprietary are varied and inherently institute-specific, it is beneficial to the owners and maintainers of the rules to have a standard way of describing who has access to what resources. Additionally, the rules describing resource access may be determined by an entity external to the holder of these resources. To these ends, this document sets out a standard, programmatic, and interoperable method of determining whether a given user is allowed to access a given resource.

The ideas presented by GMS enable data centres to do authorization checks in an interoperable fashion. In the context of authorization, interoperability can be viewed on two levels: interoperability amongst the cooperating services \emph{within} a data centre, and interoperability \emph{between} data centres.
Because of the orthogonal nature of authorization, these amount to the same problem.

Interoperability aside, GMS describes a simple, general, maintainable, and scalable approach to performing authorization, and so is a recommended architectural pattern for managing access to proprietary resources.

\subsection{Proprietary resources}

Most facilities have a period of time in which only the Principal Investigator's team has access to observation metadata and data files. Even without a proprietary period, time is required to verify and validate observations before they can be made public.

Projects also frequently create higher-level products such as catalogs and images. Often these must be accessible only to those who are authorized.

Proprietary information exists. For it to be made available in a data centre to those with authorization, a way of performing that authorization check is required.

\subsection{Role within the VO Architecture}

\begin{figure}
\centering

\includegraphics[width=0.9\textwidth]{role_diagram.pdf}
\caption{GMS in relation to the IVOA architecture}
\label{fig:archdiag}
\end{figure}

Fig.~\ref{fig:archdiag} shows the role this document plays within the
IVOA architecture \citep{note:VOARCH}.

GMS can be used by any software that needs to check, for authorization purposes, whether a user is a member of a group. Because of this general use case, GMS cuts across all of the IVOA architecture and lies squarely in the middle of the SHARING technical resource in the IVOA architecture diagram.

\subsection{Use Cases}

Aside from the main use case of restricting access to proprietary resources, GMS supports a number of other use cases, of both the user and system variety.

\paragraph{Proprietary information} Restricting access to proprietary resources to certain users.

\paragraph{Homogeneity} Using the same mechanism to control access to proprietary resources in a data centre or in multiple data centres.

\paragraph{Scalability} A distributed mechanism that scales linearly with the resources being protected.

\paragraph{Remotely managing access} A project may wish to control access to resources that reside externally.

\paragraph{Access rule sharing} A project may consist of a variety of resources that can all be managed by the same access rules.

\paragraph{Extending the services of a data centre} A project that has hosted data and metadata at a data centre may wish to create value-added services outside of the data centre itself. If some of the data or metadata is proprietary, the extended services may need to determine if a user is allowed to perform certain actions on that data or metadata.

\paragraph{Cooperating institutes} Two or more institutes may work together on a single project that involves proprietary resources, and so require a common mechanism for protecting those resources.

\subsection{Definitions}

\paragraph{Authentication} User identification through credentials or identity provider. See IVOA Single-Sign-On Profile: Authentication Mechanisms \citep{2017ivoa.spec.0524T}.

\paragraph{Authorization} Making the decision of whether to grant a user permission to a given resource. The decision can involve knowing the user's identity.

\paragraph{Resource} Something that may require authorization for access. For example, a service, a data file, metadata.

\paragraph{User} An individual identified by authentication.

\paragraph{Group} A set of users.

\paragraph{Grant} Authorizing access to a protected resource by assigning a group.

\paragraph{Revoke} Removing access to a protected resource by removing an assigned group.

\paragraph{Owner} A user or group of users who may grant or revoke access to a specific resource.

\section{Authorization Requirements}

When looking at a system that has proprietary resources that need to be protected, it is clear that there are two distinct phases to authorization: the assignment of the rules protecting the resources, and the attempts by various users to gain access to those resources. They are described here:

\begin{enumerate}
\item The owner(s) of a resource may, at any time, change the rules by which a resource may be accessed. This is the \emph{granting and revoking of access}.
\item When users try to access resources, the granting rules for that resource are evaluated at runtime. This is the \emph{authorization check}.
\end{enumerate}

With these phases in mind and with the use cases defined, we can state that the goals of authorization are:

\begin{itemize}
\item To allow for restricted access to certain resources: only a certain set of individuals may access certain resources.
\item To allow certain individuals to set the access rules on resources. The owner(s) of the resources need to manage the access rules.
\item To be able to re-use granting rules between resources. Projects must authorize access to a variety of proprietary resources.
\item To be able to manage granting rules at a single location. Projects should not have to update each resource on a change to a re-used grant.
\item To be able to reference remote granting rules. Proprietary resources should not be confined to a single institution.
\end{itemize}

\section{Groups}

\subsection{Why Groups?}

Why are groups a good model for authorization? When a system needs to perform an \emph{authorization check} on a resource, it is trying to determine if the authenticated user is allowed access. There are a number of options on how this can be accomplished.

A simple approach would be to add the identity of the user to the resource.
However, this is too restrictive as there may be multiple users who are allowed access. So, we could instead add a list of user identities to the resource being protected. A problem arises when two resources need to be protected by the same set of individuals: the lists become difficult to maintain because a change in access rules (\emph{granting and revoking access}) would mean a change to multiple resources.

So, it becomes clear that this list of users needs to be decoupled from the resource so that it can be referenced and shared by multiple resources. To do so, the list must become a single entity that can be referenced by a name. And so, we must now have a named group of users.

A central repository of groups of users would introduce other problems: a single point of failure, and the inability to partition groups of users. Thus, the \emph{location} of the group must accompany the group reference so that it is possible to have multiple collections of groups of users and multiple associated GMS services.

Resources must then reference a group by a URI with a location and a name that is unique within that location. This is called the Group Identifier.

Systems must use the information in the group identifier to query that location to determine if the user is a member of the group. Because the location may be outside of the immediate vicinity of the resource, this query must be performed in a standard and accessible manner and so is defined as a RESTful interface to group membership.

\subsection{Group Identifiers}

A \emph{group identifier} is used to uniquely and universally identify an individual group. Group identifiers are attached to proprietary resources for the purpose of referencing the group (or groups) whose members are authorized to access that resource.
When a system needs to do an authorization check because a request for access is being made, it can make the decision based on the response of a membership call to a GMS service. With the help of an IVOA Registry, the system has all the information it needs within the group identifier to locate the associated GMS service and formulate the REST call to that service for the membership check.

Group identifiers are IVOA Identifiers (IVOIDs) \citep{2016ivoa.spec.0523D}. This means they can be used to look up the underlying GMS service in an IVOA registry (as is explained in the IVOA Identifiers document). Group names are specified in the \emph{query} part of the IVOID and are mandatory in group identifiers. So, group identifiers must conform to all the rules of IVOIDs and also MUST include the \emph{query} part of an IVOID for the group name.

Below is an example of a valid and typical group identifier:

\begin{verbatim}
ivo://authority.example.com/groupService?mygroup
\end{verbatim}

There are two ways to resolve the associated GMS service URL: look up the document associated with \emph{ivo://authority.example.com/groupService} in the registry; or, issue a RegTAP query for relevant elements of that document. Here we explain how this would be done with a RegTAP query \citep{2014ivoa.spec.1208D}. Following the recommendations in that specification, the query would be done with three necessary constraints in the where clause:
\begin{itemize}
\item{ivoid} - The \emph{registry part} of the group identifier.
\item{standard\_id} - The desired search feature of GMS.
\item{intf\_role} - Always '\emph{std}', to indicate a standard service is being queried for.
\end{itemize}

and one optional constraint, \emph{security\_method\_id}, used to identify how clients can authenticate to the GMS service. To find a GMS service that does not require authentication, the value of the security\_method\_id constraint would be NULL.
However, since anonymously accessible GMS services are not likely to exist (see section \ref{subsec:infopriv}), the query should either:

\begin{itemize}
\item include a desired security\_method\_id in the where clause, as specified by the IVOA Single-Sign-On Profile \citep{2017ivoa.spec.0524T}; or
\item omit the constraint and iterate over the resulting rows to choose an appropriate security method.
\end{itemize}

The following query will return a row for each access\_url and security\_method\_id combination. The ivoid value is calculated by removing the query string from the group identifier. Since we are looking to perform an \emph{isMember} call, we ask for the GMS search capability, identified by the GMS search standardID (see section \ref{subsec:api}).

\begin{verbatim}
SELECT access_url, security_method_id
FROM rr.interface
NATURAL JOIN rr.capability
NATURAL JOIN rr.resource
WHERE
  ivoid = 'ivo://authority.example.com/groupService'
  AND standard_id = 'ivo://ivoa.net/std/gms#search-1.0'
  AND intf_role = 'std'
\end{verbatim}

Note to authors: The use of security\_method\_id is undergoing changes in RegTAP 1.1 (possible removal). This document should be updated accordingly upon its acceptance.

This would result in one or more access URLs capable of supporting a GMS search on the group 'mygroup', each with its corresponding security method.
For example, it could return three rows with values:

\vspace{3mm}
\hskip-1.0cm
\begin{tabular}{l l}
\textbf{access\_url} & \textbf{security\_method\_id} \\
\hline
https://server.example.com/gms1/search & ivo://ivoa.net/sso\#tls-with-password \\
https://server.example.com/gms2/search & ivo://ivoa.net/sso\#tls-with-certificate \\
https://server.example.com/gms2/search & ivo://ivoa.net/sso\#cookie \\
\hline
\end{tabular}
\vspace{3mm}

The first row identifies a URL to the GMS search capability supporting username and password authentication. The second and third rows show a URL that supports both client certificate authentication and cookie authentication. This also implies that membership information about the group 'mygroup' is available from either access URL.

To then perform the group membership query on any of these URLs, the service would formulate a REST call as defined by the GMS Search API.

\section{GMS Search API}

\subsection{API Definition}
\label{subsec:api}

The Group Membership Service defines a RESTful API \citep{fielding00} that allows for the determination of whether a user is a member of a group. This is the GMS search capability and is identified by the following standardID:

\begin{verbatim}
ivo://ivoa.net/std/gms#search-1.0
\end{verbatim}

Within this capability, there are two functions and associated endpoints as described in the table below.

\vspace{3mm}
\begin{tabular}{l l}
\textbf{Function} & \textbf{Endpoint} \\
\hline
boolean isMember(Group, User) & GET /search/\{group\} \\
list getMemberships(User) & GET /search \\
\hline
\end{tabular}
\vspace{3mm}

Here \emph{search} represents the \xmlel{access\_url} from the RegTAP call and \emph{\{group\}} is the groupName part of a Group Identifier.
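
As an informal illustration of how a client might combine a group identifier with a resolved \xmlel{access\_url}, the following Python sketch splits the identifier into its registry part and group name and then builds the two endpoint URLs from the table above. The function names are hypothetical and not part of this specification, and percent-encoding the group name in the path is an assumption of the sketch rather than a stated requirement.

\begin{verbatim}
from urllib.parse import quote


def split_group_identifier(group_id):
    """Return (registry part, group name) of a group identifier."""
    registry_part, sep, group_name = group_id.partition("?")
    # The query part carrying the group name is mandatory.
    if not registry_part.startswith("ivo://") or not sep or not group_name:
        raise ValueError("not a valid group identifier: " + group_id)
    return registry_part, group_name


def is_member_url(access_url, group_name):
    """URL for isMember: GET <access_url>/<group>."""
    # Percent-encoding the name is an assumption of this sketch.
    return access_url.rstrip("/") + "/" + quote(group_name, safe="")


def get_memberships_url(access_url):
    """URL for getMemberships: GET <access_url>."""
    return access_url.rstrip("/")


registry_part, group = split_group_identifier(
    "ivo://authority.example.com/groupService?mygroup")
# registry_part is the ivoid used in the RegTAP query.
print(registry_part)
print(is_member_url("https://server.example.com/gms1/search", group))
\end{verbatim}

The registry part is what would be used as the \emph{ivoid} constraint in the RegTAP query, while the group name only appears in the final REST call.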

Two (optional) parameters can be supplied to identify the user:

\begin{verbatim}
identity=
identityType=
\end{verbatim}

For a successful HTTP GET to \xmlel{/search/\{group\}}, the service shall respond with HTTP 200 (OK). If the user is a member of \xmlel{\{group\}}, the service must repeat the name of the group (terminated by a CRLF\footnote{Carriage Return character (ASCII 13) plus a Line Feed character (ASCII 10)}) in the response body in text/plain format. If the user is not a member of the group, or if the user is not recognized, or if the group is not recognized, the service must return an empty response body.

A successful HTTP GET to \xmlel{/search} shall return HTTP 200 (OK) with a list of the groupNames of which the user is a member in the response body. The response must again have a Content-Type of \emph{text/plain}. Each group (even the last) must end with a CRLF. If the user is not a member of any groups, or if the user is not recognized, the response body must be empty.

The \emph{identity} and \emph{identityType} parameters are used to identify the user who is the subject of the membership question. The \emph{identity} field is the username of the user in the context of the \emph{identityType} value. For example, when the \emph{identityType} field is set to 'X.509', the \emph{identity} field will contain the user's distinguished name. Or, if the \emph{identityType} field is set to 'OAuth', the \emph{identity} field would contain the user's OAuth token. For the full list of supported identity types please refer to the User Identification standard (\textbf{Note to authors}: This is to be written).

If the \emph{identity} and \emph{identityType} parameters are not supplied, it is assumed that the user who is the subject of the membership question is the user who is making the REST call.
This pattern will be in use when the call is being made by a service that supports and implements the IVOA Credential Delegation Protocol \citep{2010ivoa.spec.0218P}. If the user cannot be identified from the call because they have not authenticated, the service must respond with HTTP 400 (Bad Request). The other HTTP responses shall be the same as described above where the user was identified by the \emph{identity} and \emph{identityType} parameters.

If either \emph{identity} or \emph{identityType} is supplied, then both must be supplied. If only one is supplied then the service must respond with HTTP 400 (Bad Request).

For an unsuccessful HTTP GET to \xmlel{/search/\{group\}} or \xmlel{/search}, the service must respond with the appropriate HTTP response code. Some non-200 response codes and the reason for their response are:

\begin{itemize}
\item{400} - Bad request: A parameter name is unrecognized, or the value of a parameter is in an unrecognizable format, or only one of the two identity parameters is supplied.
\item{400} - Bad request: The calling user could not be identified from the HTTP request or from the supplied \xmlel{identity} and \xmlel{identityType} parameters.
\item{500} - Internal error: A service operation failure.
\end{itemize}

For 400 error responses, it is recommended that services include, in the response body, textual information about the problem and how clients should change the request details.

(Note to authors: It could be that the \emph{identity} and \emph{identityType} parameters are turned into one parameter that is in URI format and contains enough information to identify the user across different authentication mechanisms.
For example: x509:c=ca,o=grid,ou=nrc-cnrc.gc.ca,cn=brian major)

\subsection{Search Examples}

\paragraph{Example 1 - Group access to a VOSpace Node}

A user is trying to download a VOSpace file that has the group-read property set to

\begin{verbatim}
ivo://authority.example.com/gms/instance1?my-collaboration
\end{verbatim}

This resolves (through a RegTAP query for the search API) to the URL

\begin{verbatim}
https://server.example.com/gmsService/search
\end{verbatim}

To authorize the user, the VOSpace service queries the GMS search service using the user's delegated credentials

\begin{verbatim}
HTTP GET to https://server.example.com/gmsService/search/
    my-collaboration
\end{verbatim}

The GMS service identifies the user, consults its group membership information, and, on confirming that the user is a member of group 'my-collaboration', returns a response code of 200 with the string \emph{my-collaboration} (followed by CRLF) written to the response body.

\begin{verbatim}
my-collaboration
\end{verbatim}

\paragraph{Example 2 - Group access to table data}

A user issues an ADQL query to a table with row-level authorization in a TAP service. A read-group column defines which group is allowed to read that row. The first row that is encountered with a non-null read-group has value:

\begin{verbatim}
ivo://authority.example.com/gms/instance1?my-other-collaboration
\end{verbatim}

In anticipation of more rows to follow, and to avoid needing to make multiple calls to GMS, the TAP service asks for all the user's groups when the first protected row is encountered. This cached group information can be applied to all subsequent rows processed. In this example, the service does not have the user's delegated credentials and so passes the user information (as parameters) in the search call.

\begin{verbatim}
HTTP GET to
    https://server.example.com/gmsService/search
    ?identity=myusername&identityType=username
\end{verbatim}

The GMS service returns HTTP 200 and all the groups of which user 'myusername' is a member:

\begin{verbatim}
my-collaboration
my-other-collaboration
my-final-collaboration
\end{verbatim}

The TAP service caches this group membership information for the lifetime of the request so that it can be used if necessary when checking other rows. If a read-group entry with a different \emph{registry part} of the group identifier is encountered, the TAP service must call that GMS service too and add the list of groups to its cache.

\section{Implementation}

\subsection{Implementation Options}

\begin{itemize}
\item Via Grouper (groups in MySQL, users in LDAP) \footnote{https://www.internet2.edu/products-services/trust-identity/grouper/}
\item LDAP only with memberOf plugin (supports groups-of-groups)
\item VOSpace implementation: ContainerNodes = groups, DataNodes = users
\end{itemize}

\subsection{User Identity}

The concept of users and user identity is core to group authorization. When a system makes a call to a GMS service to determine if the user trying to access the resource is a member of a group, the GMS service needs to match that user against the users in its groups.

(Author note: add reference or table of user identity types.)

Data centres and astronomy institutes likely have many ways of identifying users. They could be using external identity providers, they could have a local database of users, or they may have a combination of these and other approaches. This specification does not require any particular design. Instead, it requires simply that users can be uniquely identified within the scope of a GMS service's realm.
If a user identity reaches beyond the scope of a GMS service's realm (such as an X.509 client certificate), then it, too, may be referenced by the service.

\subsection{Information Privacy}
\label{subsec:infopriv}

User and group membership information may be private, so determining who is allowed to make GMS search calls must be considered when implementing a GMS service. A GMS implementation may insist that GMS search calls must be made by a certain privileged account only. This is a reasonable approach when the service is only used within a single organization, but would require the distribution of those privileged credentials to any external sites wishing to use it.

Alternatively, a GMS service could have a policy where only the user who is the subject of the membership assertions is allowed to make the GMS search calls. This approach lends itself well to external interoperability because there need not be any sharing of credentials or trust arrangements between sites: it is always only the user who makes the service calls, even when they are transitive. This is the approach recommended in the IVOA credential delegation protocol \citep{2010ivoa.spec.0218P}. So, aside from the architectural benefits of employing this pattern, it also addresses some information privacy concerns.

\subsection{Groups of Groups}

It may be functionally attractive to support groups within groups. If this is implemented, then the service must ensure that this representation is reflected by the service API. For example, if an isMember(g) call is made, and the group 'g' is a group within another group in which the user is a member, then the service must return true. The fact that the service supports groups within groups is not exposed through the search API, but the API does not prohibit such an implementation.
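
A service that supports nesting has to evaluate membership transitively and should not loop forever if groups end up nested in a cycle. The following Python sketch is one possible way to do that with in-memory dictionaries. The names \texttt{direct\_members} and \texttt{confers}, and the example groups, are hypothetical. In particular, \texttt{confers} simply maps a group to the other groups whose members are also counted as its members; which groups appear there depends on how a given service defines nesting, and this sketch does not prescribe it.

\begin{verbatim}
def is_member(user, group, direct_members, confers, _seen=None):
    """True if user belongs to group directly or through nesting."""
    seen = set() if _seen is None else _seen
    if group in seen:
        # Already visited: guards against cyclic nesting.
        return False
    seen.add(group)
    if user in direct_members.get(group, set()):
        return True
    # Otherwise, look through every group that confers membership.
    return any(is_member(user, other, direct_members, confers, seen)
               for other in confers.get(group, set()))


# Hypothetical data: members of pipeline-devs also count as
# members of survey-team.
direct_members = {"survey-team": {"alice"}, "pipeline-devs": {"bob"}}
confers = {"survey-team": {"pipeline-devs"}}

print(is_member("bob", "survey-team", direct_members, confers))
print(is_member("alice", "pipeline-devs", direct_members, confers))
\end{verbatim}

From the caller's point of view the result is an ordinary isMember answer, which matches the statement above that nesting is not exposed through the search API.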

If one of the contained groups actually exists at another GMS instance, perhaps outside of the organization, then the service must transitively query that service to determine group membership.

\appendix

\section{Changes from Previous Versions}

\subsection{Changes from WD-GMS-1.0-20190506}
\begin{itemize}
\item{Abstract rephrased}
\end{itemize}

\subsection{Changes from WD-GMS-1.0-20190329}
\begin{itemize}
\item{Reverted Group Identifier to be an IVOID}
\item{Corrected, expanded, and clarified the group identifier registry resolution procedure}
\item{Updated bibliography references}
\end{itemize}

\subsection{Changes from WD-GMS-1.0-20181025}
\begin{itemize}
\item{Changed Group identifier URI to be in the format gms://authority/path?group}
\item{Changed names of params user and principal to identity and identityType}
\item{Corrected API definition to always return 200 on success}
\item{REST API now described in a table}
\end{itemize}

% these would be subsections "Changes from v. WD-..."
% Use itemize environments.

\bibliography{ivoatex/ivoabib,ivoatex/docrepo}

\end{document}