May 17, 2021 MongoDB
2. Relational databases follow the ACID rules
4. The benefits of distributed computing
5. The disadvantages of distributed computing
NoSQL stands for "Not Only SQL", meaning "more than just SQL".
In modern computing systems, a huge amount of data is generated on the network every day.
A large part of this data is handled by relational database management systems (RDBMSs). E. F. Codd's 1970 paper on the relational model, "A relational model of data for large shared data banks", made data modeling and application programming much simpler.
Practice has shown that the relational model is very well suited to client-server programming, with benefits far beyond what was expected, and today it is the dominant technology for structured data storage in network and business applications.
NoSQL is a new, revolutionary database movement. The idea was proposed early on, and by 2009 it had gained considerable momentum. NoSQL advocates promote non-relational data storage, which brings fresh thinking to a field dominated by relational databases.
A database transaction, much like a real-world transaction, has four properties:
1. A (Atomicity)
Atomicity means that the operations in a transaction are all-or-nothing: the transaction succeeds only if every operation in it succeeds, and if any single operation fails, the whole transaction fails and must be rolled back.
For example, a bank transfer of 100 yuan from account A to account B takes two steps: 1) withdraw 100 yuan from account A; 2) deposit 100 yuan into account B. The two steps must either both complete or both not happen. If only the first step completed and the second failed, 100 yuan would simply vanish.
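The bank transfer can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not a production pattern: the `accounts` table, the starting balances, and the `fail_midway` switch that simulates a crash between the two steps are all assumptions made for the example.

```python
import sqlite3

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if fail_midway:
                # Simulated failure between the withdrawal and the deposit.
                raise RuntimeError("simulated failure after step 1")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 500), ("B", 200)])
conn.commit()

failed = transfer(conn, "A", "B", 100, fail_midway=True)
after_failure = balances(conn)   # the withdrawal was rolled back

succeeded = transfer(conn, "A", "B", 100)
after_success = balances(conn)   # both steps were applied together
```

After the simulated failure, both balances are unchanged, so the 100 yuan is not lost. Only the second call moves the money.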
2. C (Consistency)
Consistency means that the database always remains in a consistent state: executing a transaction does not violate the database's existing integrity constraints.
For example, given an integrity constraint on a and b (say, a + b = 10), a transaction that changes a must also change b so that the constraint still holds when the transaction ends; otherwise the transaction fails.
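One way to see this is with a `CHECK` constraint in SQLite. The sketch below assumes a hypothetical invariant `a + b = 10` and a one-row table named `pair`. Note that SQLite checks the constraint per statement rather than at the end of the transaction, so this is an approximation of the idea rather than a general model of consistency.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical invariant for illustration: a + b must always equal 10.
conn.execute("CREATE TABLE pair (a INTEGER, b INTEGER, CHECK (a + b = 10))")
conn.execute("INSERT INTO pair VALUES (4, 6)")
conn.commit()

def try_update(sql):
    """Run one UPDATE in its own transaction; report whether it committed."""
    try:
        with conn:
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

only_a = try_update("UPDATE pair SET a = 7")         # would break a + b = 10
both = try_update("UPDATE pair SET a = 7, b = 3")    # keeps a + b = 10
final_row = conn.execute("SELECT a, b FROM pair").fetchone()
```

The update that changes only `a` is rejected and leaves the row untouched, while the update that adjusts both values is accepted.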
3. I (Isolation)
Isolation means that concurrently executing transactions do not interfere with one another. If the data one transaction wants to read is being modified by another transaction, the first transaction does not see those changes until the other transaction commits.
For example, suppose a transfer of 100 yuan from account A to account B is in progress and has not yet completed. If B queries the account at that moment, B does not see the additional 100 yuan.
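The account query can be sketched with two SQLite connections to the same database file: one plays the in-progress transfer, the other plays B checking the balance. The file name, table, and balances are assumptions for the example, and the behavior shown relies on SQLite's default journal mode, where readers keep seeing the last committed state while a writer has uncommitted changes.

```python
import os
import shutil
import sqlite3
import tempfile

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "bank.db")

writer = sqlite3.connect(path)
writer.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
writer.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 500), ("B", 200)])
writer.commit()

reader = sqlite3.connect(path)

def balance_of_b(conn):
    # fetchall() exhausts the cursor so the read lock is released right away.
    return conn.execute(
        "SELECT balance FROM accounts WHERE name = 'B'"
    ).fetchall()[0][0]

# The writer starts the transfer but has not committed yet.
writer.execute("UPDATE accounts SET balance = balance - 100 WHERE name = 'A'")
writer.execute("UPDATE accounts SET balance = balance + 100 WHERE name = 'B'")

seen_before_commit = balance_of_b(reader)  # still the old balance

writer.commit()
seen_after_commit = balance_of_b(reader)   # the deposit is now visible

writer.close()
reader.close()
shutil.rmtree(tmpdir)
```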
4. D (Durability)
Durability means that once a transaction is committed, its changes are saved permanently in the database and are not lost even if the system crashes.
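A rough way to observe this with `sqlite3` is to commit one change, leave another uncommitted, close the connection, and reopen the file. Closing the connection only stands in for a crash here; it does not exercise a real power failure. The `log` table and file name are assumptions for the example.

```python
import os
import shutil
import sqlite3
import tempfile

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "durable.db")

conn = sqlite3.connect(path)
conn.execute("CREATE TABLE log (msg TEXT)")
conn.execute("INSERT INTO log VALUES ('committed')")
conn.commit()                                   # this change must survive
conn.execute("INSERT INTO log VALUES ('never committed')")
conn.close()                                    # closing without commit discards it

reopened = sqlite3.connect(path)
rows = [r[0] for r in reopened.execute("SELECT msg FROM log")]
reopened.close()
shutil.rmtree(tmpdir)
```

Only the committed row is present after reopening the database.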
A distributed system consists of multiple computers and communication software components connected over a computer network (local network or wide area network).
A distributed system is a software system built on top of a network. It is precisely because of this software layer that distributed systems are highly cohesive and transparent.
As a result, the difference between a network and a distributed system lies more in the high-level software (especially the operating system) than in the hardware.
Distributed systems can be used on different platforms such as PCs, workstations, local area networks, and wide area networks.
Reliability (fault tolerance):
An important advantage in distributed computing systems is reliability.
A system crash on one server does not affect the rest of the servers.
Scalability:
More machines can be added as needed in distributed computing systems.
Resource sharing:
Sharing data is essential for applications such as banking and booking systems.
Flexibility:
Since the system is very flexible, it is easy to install, implement and debug new services.
Faster speed:
A distributed computing system can draw on the computing power of multiple computers, so it can process work faster than a single machine.
Open systems:
Because it is an open system, the service can be accessed either locally or remotely.
Higher performance:
Higher performance (and better price/performance ratio) can be provided compared to centralized computer network clusters.
Troubleshooting:
Troubleshooting and diagnosing problems is harder.
Software:
Limited software support is a major drawback of distributed computing systems.
Network:
Network infrastructure problems, including transmission problems, high load, and information loss.
Security:
The open nature of the system exposes distributed computing systems to data security and data sharing risks.
NoSQL refers to non-relational databases. NoSQL, sometimes expanded as Not Only SQL, is a generic term for database management systems that differ from traditional relational databases.
NoSQL is used to store data at very large scale (Google and Facebook, for example, collect trillions of bits of user data every day). These data stores do not require a fixed schema and can scale horizontally without extra operations.
Today we can easily access and collect data through third-party platforms such as Google and Facebook. Users' personal information, social networks, geographic locations, user-generated data, and user action logs have grown many times over. Relational SQL databases are no longer a good fit for mining this kind of user data, whereas NoSQL databases have evolved to handle such large data sets well.
RDBMS
- Highly organized, structured data
- Structured Query Language (SQL)
- Data and relationships are stored in separate tables
- Data manipulation language, data definition language
- Strict consistency
- Basic transactions
NoSQL
- Stands for Not Only SQL
- No declarative query language
- No predefined schema
- Key-value stores, column stores, document stores, graph databases
- Eventual consistency rather than ACID properties
- Unstructured and unpredictable data
- CAP theorem
- High performance, high availability, and scalability
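The "predefined schema" difference in the comparison above can be illustrated with a small sketch. The relational side uses SQLite; the document side is a hypothetical in-memory dictionary of JSON strings standing in for a document store, not the API of any real NoSQL product. Table and field names are made up for the example.

```python
import json
import sqlite3

# Relational side: the schema is declared up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Ann')")
try:
    conn.execute("INSERT INTO users (id, name, city) VALUES (2, 'Bob', 'Oslo')")
    extra_column_accepted = True
except sqlite3.OperationalError:
    # The table has no "city" column, so the statement is rejected.
    extra_column_accepted = False

# Document side: each record carries its own structure.
documents = {}  # hypothetical stand-in for a document store

def put(doc_id, doc):
    documents[doc_id] = json.dumps(doc)  # stored as JSON text

def get(doc_id):
    return json.loads(documents[doc_id])

put(1, {"name": "Ann"})
put(2, {"name": "Bob", "city": "Oslo", "tags": ["admin", "beta"]})
fields_per_doc = {doc_id: sorted(get(doc_id)) for doc_id in documents}
```

The relational table refuses a column it was not created with, while the two documents simply have different sets of fields.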
The term NoSQL first appeared in 1998 as the name of a lightweight, open source relational database developed by Carlo Strozzi that did not provide an SQL interface.
In 2009, Johan Oskarsson of Last.fm organized a discussion on open source distributed databases, and Eric Evans of Rackspace reintroduced the term NoSQL, this time referring mainly to non-relational, distributed database designs that do not provide ACID guarantees.
The "no:sql" seminar held in Atlanta in 2009 was a milestone, with the slogan "select fun, profit from real_world where relational=false;". The most common interpretation of NoSQL is therefore "non-relational", emphasizing the advantages of key-value stores and document databases rather than simply opposing RDBMSs.
In computer science, the CAP theorem, also known as Brewer's theorem, states that a distributed computing system cannot simultaneously provide all three of the following:
- Consistency: all nodes see the same data at the same time
- Availability: every request receives a response, whether it succeeds or fails
- Partition tolerance: the system keeps operating even when messages between nodes are lost or part of the system fails
The core of the CAP theorem is that a distributed system cannot satisfy consistency, availability, and partition tolerance all at once; it can satisfy at most two of the three.
Accordingly, NoSQL databases are grouped into three categories under the CAP principle: those that satisfy CA, those that satisfy CP, and those that satisfy AP.
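The trade-off between the CP and AP categories can be sketched with a toy two-replica cluster. Everything here (the `Replica` and `Cluster` classes, the `partitioned` flag, the write rules) is a simplified assumption for illustration and not how any particular database implements replication.

```python
class Replica:
    def __init__(self):
        self.data = {}

class Cluster:
    """Two replicas joined by a link that can be cut (a partition)."""

    def __init__(self, mode):
        assert mode in ("CP", "AP")
        self.mode = mode
        self.left, self.right = Replica(), Replica()
        self.partitioned = False

    def write(self, replica, key, value):
        other = self.right if replica is self.left else self.left
        if self.partitioned:
            if self.mode == "CP":
                return False           # refuse: the other replica is unreachable
            replica.data[key] = value  # AP: accept locally, replicas diverge
            return True
        replica.data[key] = value      # healthy link: update both replicas
        other.data[key] = value
        return True

    def consistent(self):
        return self.left.data == self.right.data

def run(mode):
    cluster = Cluster(mode)
    cluster.write(cluster.left, "x", 1)
    cluster.partitioned = True         # the network splits
    accepted = cluster.write(cluster.left, "x", 2)
    return accepted, cluster.consistent()

cp_result = run("CP")  # (write accepted?, replicas consistent?)
ap_result = run("AP")
```

During the partition, the CP cluster stays consistent by rejecting the write, and the AP cluster stays available by accepting it at the cost of divergent replicas.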
Advantages:
Disadvantages:
BASE: Basically Available, Soft state, Eventually consistent. Defined by Eric Brewer.
BASE describes the relaxed availability and consistency requirements that NoSQL databases typically adopt:
ACID | BASE |
---|---|
Atomicity | Basically Available |
Consistency | Soft state (flexible transactions) |
Isolation | Eventual consistency |
Durability | |
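Eventual consistency can be sketched with two replicas that accept writes independently and later reconcile. The last-write-wins rule, the integer timestamps, and the `merge` function below are assumptions chosen for brevity; real systems use a variety of reconciliation strategies.

```python
def merge(a, b):
    """Last-write-wins merge of two replicas mapping key -> (timestamp, value)."""
    merged = dict(a)
    for key, (ts, value) in b.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged

replica_1 = {"cart": (1, ["book"])}
replica_2 = {"cart": (1, ["book"])}

# Each replica keeps accepting writes on its own (basically available).
replica_1["cart"] = (2, ["book", "pen"])
replica_2["name"] = (3, "Ann")
diverged = replica_1 != replica_2   # soft state: the replicas disagree for a while

# Once the replicas can exchange state again, they converge.
replica_1 = merge(replica_1, replica_2)
replica_2 = merge(replica_2, replica_1)
converged = replica_1 == replica_2
```

Between the writes and the merge, a reader may get different answers from the two replicas. After the merge, both hold the same data.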
Type | Representative systems | Characteristics |
---|---|---|
Column store | HBase, Cassandra, Hypertable | As the name implies, data is stored by column. It conveniently stores structured and semi-structured data, compresses well, and has a large I/O advantage for queries on one column or a few columns. |
Document store | CouchDB | Documents are typically stored in a JSON-like format. This makes it possible to index individual fields and provide some of the functionality of a relational database. |
Key-value store | Tokyo Cabinet / Tyrant, Berkeley DB, MemcacheDB, Redis | A value can be looked up quickly by its key. In general the store accepts the value as-is, whatever its format. (Redis includes additional features.) |
Graph store | Neo4J, FlockDB | Best suited to storing graph relationships. Solving such problems with a traditional relational database performs poorly and is awkward to design for. |
Object store | db4o, Versant | Operates on the database with syntax similar to an object-oriented language and accesses data through objects. |
XML database | Berkeley DB XML, BaseX | Stores XML data efficiently and supports XML query syntax such as XQuery and XPath. |
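The I/O advantage claimed for column stores can be illustrated by laying out the same records in two ways and counting how many values a single-column query has to touch. This is only a conceptual sketch with made-up records; it says nothing about how HBase or Cassandra organize data on disk.

```python
rows = [
    {"id": 1, "name": "Ann", "age": 31},
    {"id": 2, "name": "Bob", "age": 45},
    {"id": 3, "name": "Cid", "age": 27},
]

# Row layout: the values of one record sit together.
row_layout = [tuple(r.values()) for r in rows]

# Column layout: the values of one column sit together.
column_layout = {col: [r[col] for r in rows] for col in rows[0]}

def avg_age_row(layout):
    touched = sum(len(record) for record in layout)  # whole records are read
    return sum(record[2] for record in layout) / len(layout), touched

def avg_age_column(layout):
    ages = layout["age"]                             # only one column is read
    return sum(ages) / len(ages), len(ages)

row_avg, row_touched = avg_age_row(row_layout)
col_avg, col_touched = avg_age_column(column_layout)
```

Both layouts give the same average, but the column layout reads a third as many values for this query.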