Basic premise:
There is a basic gap in the services that are provided to the market. This is evident in two ways. Firstly, all "home cloud" solutions appear not to be cloud based at all (as I would define a cloud based service).

The definition that I would use for a cloud based offering is not simply one that exists on the internet, but one that resiliently exists "anywhere" on the internet.
The internet is drawn as an amorphous blob or cloud on networking diagrams to suggest that the route taken to a physical host can go practically anywhere. In the same way, a cloud service must be able to be practically anywhere. It cannot simply be a fixed physical host at a single location. That is physical hosting, not cloud hosting!

Companies that offer enterprise level cloud hosting will generally have a set-up where host machines run some service (HTTP, FTP etc.), usually with more than one machine working in a cluster so that general OS issues do not cause downtime.
More peculiar is that the machines are not real: each is a virtual machine that runs on a cluster of hypervisors. If one physical machine fails, the virtual machine (and hence its services) fails over seamlessly to an alternative machine.
There are multiple power connections, backup generators, and multiple internet connections from multiple providers.

If a plane crashes into the data centre, everything is replicated to a data centre down the road, so downtime is still limited and data is still safe.
People who buy these services need not worry about accessing their files, as their files exist "somewhere in the cloud", not just at one physical site.

A lot of the "home cloud hosting solutions" just aren't cloud solutions. They are traditional single machine hosting, where your home broadband may fail, power may be cut, or pets may chew through cables, leaving you unable to access your files.

The proposed solution to this problem is to create a "real home cloud" solution. This project will be largely software driven, creating client/server software that enables machines to join a network and replicate files amongst themselves.

People can put very low power file servers at their home and at their parents' house, friends' houses etc. Small businesses can have file servers at a director's house and at the office, to ensure that offsite backups are constantly taken and that those files are constantly available.

Users can choose to create private clouds for their own files, or to join or create public clouds for the large scale distribution of ideas to the public domain.


Why do I want to do this?

I do not believe that there is adequate provision of non-corporate repositories for designs of physical objects. Users have very little protection against changes of use and changes to terms of service, and can find themselves helping to create valuable intellectual property that a company will call its own and use to its advantage.

This is typified by the "Thingiverse issue". Whilst I have no strong views on Thingiverse, the buyout of MakerBot by Stratasys, or the changing of terms on Thingiverse, coverage has shown that a lot of people do have strong views.

People have pulled designs, or uploaded useless designs in place of working designs as a protest. Other manufacturers are attempting to re-create Thingiverse, but tied to their specific brand of printer. Third party companies are attempting to fill the gap in the market (the lack of non-vendor hosting sites). However, there is a perception that so long as the hosting site is a company, there is always the possibility that it will change its terms of use, get bought out or get shut down.

My project has two major goals.
1, To bring large scale distributed "cloud" services to home users.
2, To use that technology to create a large scale, community driven, community hosted project repository where people may store projects and design files.



This sounds like a software project?!
There is undoubtedly a large software component to the project. However, the end goal of the project is to create a device that any person could walk into a shop and buy two of reasonably cheaply, leaving one at their own house and one at a friend's or parent's house, and so have their work more readily available online, or have their projects backed up.

Current commercial offerings (such as the Western Digital cloud devices) do not appear to offer cloud hosting, only physical server hosting, and do not allow for the distribution of services.

Technical detail:
Architecture

The network will be made of various types of nodes.
Super nodes
Nodes
Peers/clients

A person who wishes to establish a network must first install the client software.

They will then select to establish a new network.
They will give the name of their network entry point.

They will then select the files and or services that they wish to distribute.


These files and services will be called pools. Each pool will receive a unique name identifying both the pool and the network in which it resides.

The user will then select whether they wish the pool to be public (shared with all) or private (shared only with those who have a username and password).
The only action that a user needs to take is to configure a port and open that port on their firewall. To this end, no greater technical ability is required than for setting up any other home cloud solution.
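
A pool as described above could be held in a small C record. This is only a sketch under my own assumptions: the struct layout, the function names, the 64 byte name limit and the "network/pool" form of the full name are all illustrative and not a settled design.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define POOL_NAME_MAX 64 /* assumed limit, not part of the design */

enum pool_visibility { POOL_PUBLIC, POOL_PRIVATE };

/* Hypothetical record for one pool of shared files or services. */
struct pool {
    char network[POOL_NAME_MAX];      /* network entry point name */
    char name[POOL_NAME_MAX];         /* pool name within that network */
    enum pool_visibility visibility;  /* public or private */
    unsigned short port;              /* port the user opened on the firewall */
};

/* Fill in a pool record. Returns false if a name is empty or too long. */
bool pool_init(struct pool *p, const char *network, const char *name,
               enum pool_visibility visibility, unsigned short port)
{
    assert(p != NULL && network != NULL && name != NULL);
    size_t nlen = strlen(network);
    size_t plen = strlen(name);
    if (nlen == 0 || plen == 0 || nlen >= POOL_NAME_MAX || plen >= POOL_NAME_MAX)
        return false;
    memcpy(p->network, network, nlen + 1);
    memcpy(p->name, name, plen + 1);
    p->visibility = visibility;
    p->port = port;
    return true;
}

/* Write "network/pool" into out. Returns false if it does not fit. */
bool pool_full_name(const struct pool *p, char *out, size_t out_size)
{
    assert(p != NULL && out != NULL);
    int n = snprintf(out, out_size, "%s/%s", p->network, p->name);
    return n >= 0 && (size_t)n < out_size;
}
```

Combining the network name and the pool name is one simple way to keep pool names unique across networks; any other scheme that achieves the same would do.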

Super nodes act as gateway machines. They should have a named entry point (e.g. gateway.noip.com).

These nodes must contain a network map of other supernodes and nodes within the network.
When a client connects to a supernode, the supernode should advertise all public nodes known to it.
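
The network map and the advertising step could look something like the following. This is a rough sketch: the entry fields and the function name advertise_public are my own placeholders, and a real map would also need to be updated as nodes come and go.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum node_role { ROLE_NODE, ROLE_SUPERNODE };

/* Hypothetical entry in a supernode's network map. */
struct map_entry {
    const char *address;   /* host name or IP address */
    unsigned short port;   /* service port */
    enum node_role role;   /* node or supernode */
    bool is_public;        /* false for hidden or private nodes */
};

/*
 * Collect pointers to the public entries of the map, in map order.
 * Returns how many were written to out (never more than max_out).
 */
size_t advertise_public(const struct map_entry *map, size_t count,
                        const struct map_entry **out, size_t max_out)
{
    assert(map != NULL || count == 0);
    assert(out != NULL || max_out == 0);
    size_t n = 0;
    for (size_t i = 0; i < count && n < max_out; i++) {
        if (map[i].is_public)
            out[n++] = &map[i];
    }
    return n;
}
```

Private entries stay in the map, so the supernode can still route authenticated clients to them, but they are never included in what is sent to an anonymous client.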

Authentication may be required on the node. This authentication can be by way of username and password, or SSL certificate based authentication.

Once a client has received a list of servers in a pool, the user should be prompted as to which pools they wish to either privately replicate (download only) or publicly mirror (upload and download) within the network.

At this point users may also manually request to join hidden networks (this would be similar to joining a wireless network with a hidden SSID). The requests to join the network, along with the passwords or certificate parts needed to join the pools, will then be sent.

The gateway supernode servers will authenticate the right to join the network (if applicable) and at this point send a list of files within the directory and a list of places those files may be downloaded from.

A client, (or node, or new supernode) should then elect to download files from other nodes within the network.




Parallel downloads.
Files should be downloaded in chunks, such that the first half of a file can be downloaded from one node and the second half from another peer.
The reason for this is that most home users only have ADSL, where ratios of 1 up to 10 down may be in effect. Additionally, users may limit their upstream bandwidth, so trying to get all of a file from one user may be infeasible!
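
As a rough illustration of the chunking idea, the sketch below splits a file into fixed size chunks and hands them out to the available peers in turn. The chunk size, the round-robin assignment and the names plan_chunks and struct chunk are assumptions of mine; a real client would probably weight peers by their measured upload speed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One planned request: which bytes to fetch and from which peer. */
struct chunk {
    uint64_t offset;  /* first byte of the chunk within the file */
    uint64_t length;  /* number of bytes in the chunk */
    size_t   peer;    /* index of the peer asked to serve it */
};

/*
 * Split a file into chunks of at most chunk_size bytes and assign them
 * to peers round-robin. Returns the number of chunks written to out
 * (never more than max_chunks).
 */
size_t plan_chunks(uint64_t file_size, uint64_t chunk_size, size_t peer_count,
                   struct chunk *out, size_t max_chunks)
{
    assert(chunk_size > 0 && peer_count > 0 && out != NULL);
    size_t n = 0;
    for (uint64_t off = 0; off < file_size && n < max_chunks; off += chunk_size) {
        uint64_t remaining = file_size - off;
        out[n].offset = off;
        out[n].length = remaining < chunk_size ? remaining : chunk_size;
        out[n].peer = n % peer_count;
        n++;
    }
    return n;
}
```

With two peers, a 10 byte file and a chunk size of 4, this gives three chunks: bytes 0 to 3 from the first peer, 4 to 7 from the second, and the final 2 bytes from the first peer again.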

Anti poisoning.
Each node should ask (if possible) for a hash of a file part from two servers.
These MD5 hashes should be compared. If they do not match, a third server should be asked to act as an arbitrator, and the file part should then be requested from a server that sent the correct MD5 hash.
Once the download is complete, the file part should be hashed and that hash compared to the original hash sent.

In this way it should become very difficult for a host to upload corrupted files, or for a person to deliberately poison the file pool by replacing known files with a virus.
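
The comparison and arbitration steps above might be sketched as follows. The code works on 16 byte MD5 digests that have already been fetched or computed elsewhere; it does not implement MD5 itself. The function names and the return convention are illustrative assumptions only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DIGEST_LEN 16 /* an MD5 digest is 16 bytes */

/*
 * Decide which of two servers to request a file part from.
 * a and b are the digests they advertised. arbiter is the digest from a
 * third server, or NULL if no third server has been asked yet.
 * Returns 0 or 1 for the chosen server, or -1 if no decision is possible.
 */
int choose_source(const unsigned char *a, const unsigned char *b,
                  const unsigned char *arbiter)
{
    assert(a != NULL && b != NULL);
    if (memcmp(a, b, DIGEST_LEN) == 0)
        return 0;                 /* both agree, either will do */
    if (arbiter == NULL)
        return -1;                /* mismatch: a third opinion is needed */
    if (memcmp(a, arbiter, DIGEST_LEN) == 0)
        return 0;
    if (memcmp(b, arbiter, DIGEST_LEN) == 0)
        return 1;
    return -1;                    /* all three disagree */
}

/* After the download: does the part we received match what was promised? */
bool part_is_intact(const unsigned char *advertised,
                    const unsigned char *computed)
{
    assert(advertised != NULL && computed != NULL);
    return memcmp(advertised, computed, DIGEST_LEN) == 0;
}
```

When the result is -1 the client would either ask a third server or give up on that file part and try different servers. Note that two colluding servers could still out-vote an honest one, so this reduces the risk rather than removing it.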

Reporting
A host may choose to report a poisoning attempt to a supernode. Once a supernode has received enough reports, it may choose to silently drop the node from the network, no longer sending new clients to that node.
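
A very simple version of this report counting is sketched below. The threshold of 3 is an arbitrary assumption, as are the names used. A real supernode would probably also want to count each reporting host only once, so that a single malicious host cannot get an honest node dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define REPORT_THRESHOLD 3u /* assumed value, to be tuned */

/* Hypothetical per-node bookkeeping held by a supernode. */
struct node_status {
    unsigned reports;  /* poisoning reports received so far */
    bool advertised;   /* whether new clients are still sent to this node */
};

/*
 * Record one poisoning report against a node. Once the threshold is
 * reached the node is silently no longer advertised.
 * Returns true if the node is still advertised after this report.
 */
bool record_poison_report(struct node_status *node)
{
    assert(node != NULL);
    if (node->reports < REPORT_THRESHOLD)
        node->reports++;
    if (node->reports >= REPORT_THRESHOLD)
        node->advertised = false;
    return node->advertised;
}
```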

Uploads
Users who create pools can choose to create public upload pools (i.e. anyone can upload),
or pools where only the pool creator can upload files (e.g. only files uploaded to a gateway website, or only core developers on a dev team).


Repudiation vs. Non-Repudiation
Users can choose to upload anonymously, or as a named user with an SSL certificate to identify them.

End users may choose whether or not to accept files without owners.

- e.g. this may be useful for groups that wish to upload files while disassociating themselves personally from them (leaks, dissident information etc.),
but may also create a situation where would-be replicators want nothing to do with the content uploaded, the obvious example being child abuse imagery.





Network resilience
At any point all nodes should know the addresses of other nodes within the network. Clients need to know the addresses and service ports of those nodes that are included in the pool, so that they can check on those servers for file updates. This means that peer level clients may be promoted to node status simply by configuring the relevant firewall ports and telling the software to increase the service level. At this point a message would be sent to the gateway supernodes in the network, and the address distributed to other clients.

Nodes may be promoted to supernodes with a similar change to the software's service offerings configuration.

This will create a new entry point to the network. The named address of the network node is not important (it can be anything). What is important is that the configuration is shared throughout the network.

e.g. nodes will connect to gateway.com and will learn of other entry points: gateway2.com and gateway3.com.
In this way, if gateway.com is taken offline for any reason the network will live on, and nodes may continue to rejoin the network: when they cannot connect to gateway.com they will try to connect to gateway2.com and gateway3.com.
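
The fallback between entry points could be as simple as trying each known gateway in order. In the sketch below the actual connection attempt is passed in as a callback, so that the example stays free of socket code. The names join_via_gateways and connect_unless_offline are my own, and the second function is only a stand-in that pretends one named host is offline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Attempts a connection to host. ctx is passed through untouched. */
typedef bool (*connect_fn)(const char *host, void *ctx);

/*
 * Try each known gateway in order until one accepts the connection.
 * Returns the index of the gateway that was reached, or -1 if none was.
 */
int join_via_gateways(const char *const *gateways, size_t count,
                      connect_fn try_connect, void *ctx)
{
    assert(try_connect != NULL);
    assert(gateways != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (try_connect(gateways[i], ctx))
            return (int)i;
    }
    return -1;
}

/*
 * Stand-in for real network code: ctx names a single host that is
 * treated as offline, every other host is treated as reachable.
 */
bool connect_unless_offline(const char *host, void *ctx)
{
    const char *offline = ctx;
    return strcmp(host, offline) != 0;
}
```

With the list gateway.com, gateway2.com, gateway3.com and gateway.com offline, the call returns index 1, meaning the node rejoined through gateway2.com.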

This does, however, mean that any information passed by word of mouth may be out of date. E.g. if I tell you to connect to gateway.com after it has been taken offline, then this will not be possible, and your freshly installed client will not know about gateway2.com or gateway3.com.

Known gateways should be displayed to the user in some fashion so that this information can be passed between people (e.g. in big letters on a control panel).


Nodes may not be promoted to supernodes unless they meet a minimum service offering.
Minimum service offerings are:
Ability to accept connections on a simple port (port 80 should be the default port)
Ability to serve DNS (as it must be able to serve addresses for service offerings, e.g. point to web or FTP servers)
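
The promotion rules could be expressed as a small check like the one below. This is a sketch under my own assumptions: the three service levels follow the node types listed earlier, but the struct, the field names and the function promote are placeholders.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum service_level { LEVEL_PEER, LEVEL_NODE, LEVEL_SUPERNODE };

/* What a machine is currently able to offer (hypothetical fields). */
struct capabilities {
    bool port_open;   /* accepts connections on its configured port */
    bool serves_dns;  /* can serve addresses for service offerings */
};

/*
 * Return the level a machine would hold after asking for one promotion.
 * A peer becomes a node once its port is open. A node becomes a
 * supernode only if it meets the minimum service offering.
 * If the requirements are not met the level is unchanged.
 */
enum service_level promote(enum service_level current,
                           const struct capabilities *caps)
{
    assert(caps != NULL);
    switch (current) {
    case LEVEL_PEER:
        return caps->port_open ? LEVEL_NODE : LEVEL_PEER;
    case LEVEL_NODE:
        return (caps->port_open && caps->serves_dns) ? LEVEL_SUPERNODE
                                                     : LEVEL_NODE;
    default:
        return current; /* already a supernode */
    }
}
```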

There is no reason that a supernode must partake in all file or service sharing pools on a network (a supernode can choose to only act as a gateway)





Programming language
The program will be written in C, and will provide a web interface for configuration.

C is chosen because it will compile on most (if not all) of the hardware platforms of interest.

Initially I will develop on a Raspberry Pi, with the goal of creating a standalone appliance disk caddy where a hard drive can simply be plugged in and given a network connection.

However, there is no reason to exclude the use of BeagleBone boards, or existing x86 or x64 servers.

The goal is to provide a powerful and distributed file and service mirroring platform.



Electronics,
Principally the test and development system will be built around a Raspberry Pi Model B board, a commercial USB SATA controller and a laptop SATA hard drive.

The finished product will be based around the Raspberry Pi Compute Module. Pins will be broken out for activity lights that will indicate the status of the archive. Optionally, an LCD screen may be used to indicate disk space used and connectivity, and for setting basic parameters (establishing the IP address etc.).

A series of enclosures will also be designed.
As a commercial product, an extruded aluminium enclosure will be a viable mass production proposition.

For small "DIY" users a simple 3d printed case will be designed.



End goal
The end goal is to produce a globally distributed network of file control without ownership, in order to share free (as in freedom and as in beer) open source ideas and designs.

The network should not be exclusive. It does not matter if you have data centre hosting or are hosting at home. The network will not discriminate any more than its users configure (via upload/download limits).

The hardware will not be exclusive. You will not have to buy a commercial product.
You will not have to have access to a 3D printer or any specialist tooling.

It should be possible for users to simply buy a Raspberry Pi and join the network (sharing files and services from the SD card).

However far more utility and long term reliability would be gained from using hard drive based storage.


Philosophical goal,
The philosophical goal is to make the "power of cloud computing", in terms of distribution and uptime, properly available in homes and small businesses.
Principally, the goal of the first network of shares and services will be to create an open source, community driven website as a 3D objects repository.

Users who join the distributed service will host a web service and the source files of 3D objects. An end user will connect to the site, and each object may be downloaded from a different server (not dissimilar to sites which load some content from one server and other content from an image repository etc., fetching files from two servers simultaneously to build a site).



As the long term goal of the project is to encourage open sharing of ideas, and sharing of the knowledge and ability to create things, this project should not exclude people with beginner skills.
It might be that a future development sees a micro sized board to operate 1.5" disc drives, or a multi layer board. However, creating this is a commercial goal, and will not be the core focus of this project.
The end goal is to create hardware designs that any person with beginner level electronics skills can follow to create this device (in this case a beginner is defined as someone able to put together Velleman style kits, i.e. someone who can solder).

Licensing,
I have not yet decided on a licence to use. Ideally I would like a licence that creates open availability for users to copy and create their own devices, yet allows commercial licensing for businesses that wish to integrate the technology into their own devices (suggestions of a licence that would allow this are welcome!).





Realistic plans
Within the initial three months the development plan is as follows:

A board for the Raspberry Pi Compute Module will be designed.

A board for the Raspberry Pi Compute Module will be created. This board will be a home made trial.


A 3D printable enclosure for a basic setup (Raspberry Pi Model B + USB SATA controller + SATA drive) will be designed and created.
A 3D printable enclosure for a Raspberry Pi Compute Module based board will be designed and created.


The initial file sharing software should be written and tested.
At the start of the project, the website software will need to be chosen and configured on each node individually.

The control panel for determining what networks to join should be created (easy user control)

Anti poisoning measures (to stop file corruption) should be created.




Longer term goals

Certificate files for file ownership and authentication on the network should be created to allow nodes to develop trusts.

Software for creating the website configuration should be created, to allow a user to join a network without in depth knowledge of setting up a web server.

Software for updating sites and services should be created (after allowing people to set up and forget about web servers, automatic updates of the cloud services and servers will be necessary).


A basic tool pattern for a die for creating an extruded aluminium case will be designed (this would essentially be a profile of the case).