Apuntes cosas aprendidas de IT: September 2016

After some experience with SOAP web services at my job, it seems that people are more receptive to using REST web services. I prefer SOAP in legacy systems, mainly because the WSDL is a bounded contract and easy to document. Legacy systems are not flexible, so you have to set all the rules beforehand. Legacy systems do not lend themselves to agile methods, so scaling out is not something you have to keep in mind. For that reason I think SOAP web services are better suited to PMI-style projects with traditional stages, where consistency is the key point rather than availability. Nevertheless, web and mobile applications are starting to change everything, even in legacy systems. The volume of calls and data is making availability more and more important, so caching and scaling out are things to really keep in mind, and they are forcing us to set up solutions based on the cloud and on lightweight protocols such as REST + JSON. This article is a collection of best practices and things to keep in mind when designing a REST solution.

USE NOUNS NOT VERBS

In order to avoid a flood of URIs we should use nouns instead of verbs. If we have an entity called product, your system will easily end up needing the whole set of CRUD operations:

/getProducts

/updateProduct

/addProduct

/findProduct

Therefore you will have at least 4 new URIs. This is a fine-grained approach. Another point of view (coarse-grained) is to take just Product and use the HTTP methods to distinguish the operations.

Technically, the solution is /products, in the plural. This is useful in scenarios where you can get an object with a single HTTP GET call based on its ID, that is to say, GET <uri root>/products/3452352. Of course, if you need to send a larger set of parameters, it will be necessary to do a POST with a JSON/XML object.
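As a rough illustration of the coarse-grained approach, the sketch below dispatches on the HTTP method plus a single noun-based URI instead of one URI per verb. The `handle` function, the in-memory `PRODUCTS` dictionary and the status codes chosen are assumptions made for this example, not part of any particular framework.

```python
import re

# Hypothetical in-memory "database" of products, keyed by id.
PRODUCTS = {3452352: {"id": 3452352, "name": "sample product"}}

# One noun-based URI pattern; the HTTP method selects the operation.
ITEM = re.compile(r"^/products/(\d+)$")


def handle(method, path, body=None):
    """Dispatch on (HTTP method, noun URI) and return (status, payload)."""
    if path == "/products":
        if method == "GET":
            return 200, list(PRODUCTS.values())
        if method == "POST":
            # body is the decoded JSON object describing the new product
            PRODUCTS[body["id"]] = body
            return 201, body
        return 405, None

    match = ITEM.match(path)
    if match is None:
        return 404, None  # verb-style URIs such as /getProducts do not exist

    product_id = int(match.group(1))
    if method == "GET":
        product = PRODUCTS.get(product_id)
        return (200, product) if product else (404, None)
    if method == "PUT":
        PRODUCTS[product_id] = body
        return 200, body
    if method == "DELETE":
        PRODUCTS.pop(product_id, None)
        return 204, None
    return 405, None
```

With this shape, adding a new entity means adding one more noun, not four more verb URIs.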

Table of HTTP methods

S.N.	HTTP Method	URI	Operation	Property
1	GET	http://localhost:8080/UserManagement/rest/UserService/users	Get list of users	Read only
2	GET	http://localhost:8080/UserManagement/rest/UserService/users/1	Get user with Id 1	Read only
3	PUT	http://localhost:8080/UserManagement/rest/UserService/users/2	Insert user with Id 2	Idempotent
4	POST	http://localhost:8080/UserManagement/rest/UserService/users/2	Update user with Id 2	N/A
5	DELETE	http://localhost:8080/UserManagement/rest/UserService/users/1	Delete user with Id 1	Idempotent
6	OPTIONS	http://localhost:8080/UserManagement/rest/UserService/users	List the supported operations in the web service	Read only
7	HEAD	http://localhost:8080/UserManagement/rest/UserService/users	Returns only the HTTP headers, no body	Read only
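To make the "Idempotent" label in the table concrete, here is a minimal sketch using a plain dictionary as the user store. The function names `put_user`, `delete_user` and `get_user` are hypothetical; the point is only that repeating a PUT or a DELETE leaves the store in the same state as doing it once, while GET changes nothing.

```python
# Hypothetical in-memory user store, keyed by user id.
users = {}


def put_user(user_id, data):
    """PUT /users/{id}: store the representation at a known id."""
    users[user_id] = dict(data)
    return users[user_id]


def delete_user(user_id):
    """DELETE /users/{id}: remove the user; deleting twice is harmless."""
    users.pop(user_id, None)


def get_user(user_id):
    """GET /users/{id}: read only, never modifies the store."""
    return users.get(user_id)


def snapshot():
    """Return a copy of the store so two states can be compared."""
    return {key: dict(value) for key, value in users.items()}
```

Calling `put_user(2, {"name": "Ana"})` once or five times yields the same `snapshot()`, which is what lets a client safely retry after a timeout.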

So you might think there is a one-to-one correspondence between CRUD operations and HTTP methods. However, life is not that easy.

JSON

Here is some advice for writing JSON.

Content-Type: application/json

Notation: follow the JavaScript standard, camelCase, so avoid underscores and hyphens such as

load_products. It is better to use loadProducts.
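If the data comes from a source that uses underscores or hyphens (database columns, for instance), the keys can be renamed just before serializing. The helpers below are a small sketch; `to_camel_case` and `camelize_keys` are names invented for this example, and the code assumes all dictionary keys are strings.

```python
def to_camel_case(name):
    """Convert 'load_products' or 'load-products' to 'loadProducts'."""
    parts = [p for p in name.replace("-", "_").split("_") if p]
    if not parts:
        return name
    # Keep the first word as it is and capitalize the first letter of the rest.
    return parts[0] + "".join(p[:1].upper() + p[1:] for p in parts[1:])


def camelize_keys(value):
    """Recursively rename the keys of nested dicts and lists."""
    if isinstance(value, dict):
        return {to_camel_case(k): camelize_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [camelize_keys(item) for item in value]
    return value
```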

Dates and times: use the standard described in ISO 8601, so dates, times and durations already have a standard representation.
Avoid href in JSON, and provide pagination with offset and size parameters if there are too many objects to return in one response. Of course, this can only be achieved if the web service is backed by a SQL or other database store that allows opening a cursor.
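The two points above, ISO 8601 dates and offset/size pagination, can be sketched together. `ITEMS` is a hypothetical list standing in for rows read through a database cursor, and the field names `offset`, `size`, `total`, `items` and `createdAt` are assumptions for the example, not a standard.

```python
import json
from datetime import datetime, timezone

# Hypothetical data set standing in for rows read from a database.
ITEMS = [
    {"id": i, "createdAt": datetime(2016, 9, 15, 12, 0, i, tzinfo=timezone.utc)}
    for i in range(25)
]


def get_page(items, offset=0, size=10):
    """Return one page of items as a JSON-serializable dict."""
    if offset < 0 or size <= 0:
        raise ValueError("offset must be >= 0 and size must be > 0")
    page = items[offset:offset + size]
    return {
        "offset": offset,
        "size": len(page),
        "total": len(items),
        "items": [
            # isoformat() produces an ISO 8601 string such as
            # 2016-09-15T12:00:00+00:00
            {"id": item["id"], "createdAt": item["createdAt"].isoformat()}
            for item in page
        ],
    }


def get_page_json(items, offset=0, size=10):
    """Serialize the page as the body of an application/json response."""
    return json.dumps(get_page(items, offset, size))
```

A client would then walk the collection with requests such as `?offset=0&size=10`, `?offset=10&size=10`, and stop when `size` comes back smaller than requested.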

RESTful security

Statelessness is a goal if you want to run your web service as microservices or on a serverless platform, so try to write sessionless services, authenticating on each request if needed.
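One way to authenticate each request without keeping a server-side session is to have the client present a signed token that any instance can verify on its own. The sketch below uses an HMAC from the Python standard library; `SECRET_KEY`, `sign` and `authenticate` are hypothetical names, the token has no expiry, and a real service would more likely rely on an established scheme such as OAuth.

```python
import hashlib
import hmac

# Hypothetical shared secret; every service instance has the same key,
# so no instance needs to remember a session.
SECRET_KEY = b"demo-secret-change-me"


def _mac(user_id):
    return hmac.new(SECRET_KEY, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def sign(user_id):
    """Issue a token of the form '<user_id>.<hex mac>'."""
    return f"{user_id}.{_mac(user_id)}"


def authenticate(token):
    """Verify the token on every request; return the user id or None."""
    user_id, separator, mac = token.rpartition(".")
    if not separator or not user_id:
        return None
    expected = _mac(user_id)
    # compare_digest avoids leaking information through timing differences.
    if hmac.compare_digest(mac.encode("utf-8"), expected.encode("utf-8")):
        return user_id
    return None
```

Because the check depends only on the token and the shared key, any instance behind a load balancer can serve any request.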

Never authorize based on URIs.

Use OAuth 1 or 2, over SSL (now TLS).

Use API keys instead of user passwords.

For IDs, avoid sequential numbers. UUIDs are a good solution.
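As a small sketch of the last point, random (version 4) UUIDs from the Python standard library give identifiers that cannot be guessed by incrementing a number. The helper names `new_resource_id` and `resource_uri` are made up for the example.

```python
import uuid


def new_resource_id():
    """Return a random version 4 UUID as a string."""
    return str(uuid.uuid4())


def resource_uri(collection, resource_id):
    """Build a noun-based URI such as /products/<uuid>."""
    return f"/{collection}/{resource_id}"
```

Hard-to-guess IDs do not replace authorization checks; they only make it harder to enumerate resources by guessing neighbouring IDs.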

Cloud File Systems.

The industry classifies storage according to how data is persisted and accessed. Although there are hybrid systems, one division you will come across is:

Ephemeral Storage.

This is tied to the life cycle of the VM. If your VM terminates, your storage disappears. Nova in OpenStack and Amazon EC2 provide ephemeral storage automatically. It is usually based on SSD disks.

Persistent Storage.

Persistent storage means that the storage is always available, no matter the state of the instance or VM. It has 3 subcategories: object storage, block storage and shared file storage.

We will see some implementations below, but a good reference is the OpenStack documentation, which has a good table to help make the decision: http://docs.openstack.org/ops-guide/arch-storage.html

Table. OpenStack storage
	Ephemeral storage	Block storage	Object storage	Shared File System storage
Used to…	Run operating system and scratch space	Add additional persistent storage to a virtual machine (VM)	Store data, including VM images	Add additional persistent storage to a virtual machine
Accessed through…	A file system	A block device that can be partitioned, formatted, and mounted (such as, /dev/vdc)	The REST API	A Shared File Systems service share (either manila managed or an external one registered in manila) that can be partitioned, formatted and mounted (such as /dev/vdc)
Accessible from…	Within a VM	Within a VM	Anywhere	Within a VM
Managed by…	OpenStack Compute (nova)	OpenStack Block Storage (cinder)	OpenStack Object Storage (swift)	OpenStack Shared File System Storage (manila)
Persists until…	VM is terminated	Deleted by user	Deleted by user	Deleted by user
Sizing determined by…	Administrator configuration of size settings, known as flavors	User specification in initial request	Amount of available physical storage	User specification in initial request; requests for extension; available user-level quotas; limitations applied by Administrator
Encryption set by…	Parameter in nova.conf	Admin establishing encrypted volume type, then user selecting encrypted volume	Not yet available	Shared File Systems service does not apply any additional encryption above what the share’s back-end storage provides
Example of typical usage…	10 GB first disk, 30 GB second disk	1 TB disk	10s of TBs of dataset storage	Depends completely on the size of back-end storage specified when a share was being created. In case of thin provisioning it can be partial space reservation (for more details see Capabilities and Extra-Specs specification)

1. Ceph. A state-of-the-art system that offers a kind of logical access layer from Linux. It gives all the parallelism and data redundancy expected in a cloud or distributed system: reliability, no single point of failure, and scalability. It is based on CRUSH. Ceph seems to be better suited to general-purpose use than HDFS. As with other distributed systems, the first step in building it is to decouple the data completely from its metadata. Ceph tries to remove any need for clients to understand the internals of the system and uses something similar to shell tools and the libfuse library so that files can be accessed in a POSIX manner.

2. HDFS. Hadoop Distributed File System. As the name shows, it is the file system of Hadoop, the MapReduce implementation. It is used by HBase too. Written in Java, it relies on the POSIX file system of the underlying host and is rack-aware by DNS name or IP address. It is based on a master/slave architecture: a NameNode, or master, performs the file system operations through an RPC interface, and the slaves are called DataNodes.

3. Swift. This is an object store; it belongs to the OpenStack IaaS. Swift provides storage of blobs via web access. The object store can be used to store any data, but a typical use case is the storage of images or videos. The links to those objects might be kept in a traditional database, and access would be through the web. The API is a set of RESTful services using PUT or GET and a URL with the path to the object. Swift is highly available: it is balanced using a load balancer plus proxies that translate each request into the actual path to the object and its node. The objects are replicated so that requests can be parallelized.

4. Cinder. This storage falls under the type called block storage. A Cinder volume is attached directly to the VM. You can think of it as something like a USB drive attached to your laptop. However, you need to install a file system on it because the volume is raw. So, like AWS EBS, it requires operational expertise.

5. Hive. A database built on MapReduce that gives eventual consistency. It has HQL, which is a kind of SQL, and translates queries into MapReduce jobs. Hive uses a traditional SQL database to store its metadata; it can be MySQL, Oracle or others. Note that update and delete are not supported, except on transactional tables in recent versions.

6. Amazon S3. It provides a service very similar to Swift. You have web access through REST; it used to support SOAP as well, although that is now being phased out. S3 also provides CLI commands in a shell style. Objects can be up to 5 TB each. They are stored in a container called a ‘bucket’. Users choose a key that maps to the object in order to fetch it. The organization of the bucket is flat, so there is no hierarchy at all, and you can store as many objects as you want, each up to 5 TB. Like Swift, it has a weak consistency model called eventual consistency. This kind of consistency is a response to the CAP trade-off: if all writes stopped, the system would eventually return the expected value, but concurrent writes do not guarantee consistent reads. So you shouldn’t build an ACID system such as a bank account database on top of it.

7. Amazon AWS EBS. Elastic Block Store. It is similar to Cinder in OpenStack. Amazon provides a list of volume types (gp2, io1, st1, sc1) based on SSD or magnetic disks and on their IOPS capacity.

8. Amazon AWS Glacier. Archive service. As its name suggests, it is meant for long-term storage of archived files. Archives are stored in ‘vaults’, and if you want to download a file you will need to wait 3 to 5 hours for it to be ready.

9. Amazon EFS. Elastic File System. It falls under the shared file system category. It is SSD-backed, so it is fast. Amazon EBS and Cinder require operational tasks to prepare the disks and install a file system, whereas EFS provides fully NFSv4-compliant network access and is elastic, growing as needed.

10. Dropbox. Based on Amazon S3, and it uses Amazon EC2 for the business logic. It has two levels of API: Drop-ins, to embed into a web UI, and the Core API. It supports OAuth v1 and v2.
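The eventual consistency mentioned for Swift and S3 in the list above can be illustrated with a toy model. This is not how either service is implemented; `Replica` and `synchronize` are hypothetical names, and the "newest version wins" rule is an assumption chosen to keep the sketch short. It only shows that a read from a replica that has not yet been synchronized can return a stale value, and that all replicas agree once writes stop and synchronization runs.

```python
class Replica:
    """One copy of the object store, holding key -> (version, value)."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, version):
        current = self.data.get(key)
        # Keep the entry only if it is newer than what this replica has.
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def synchronize(replicas):
    """Propagate the newest version of every key to every replica."""
    newest = {}
    for replica in replicas:
        for key, (version, value) in replica.data.items():
            if key not in newest or version > newest[key][0]:
                newest[key] = (version, value)
    for replica in replicas:
        for key, (version, value) in newest.items():
            replica.write(key, value, version)
```

Between a write on one replica and the next `synchronize` call, a read served by another replica still sees the old value, which is why such a store is a poor fit for something like an account balance.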


Thursday, 15 September 2016