Friday, February 22, 2019

Industry Practices and Tools 1


1. What is the need for VCS?

A Version Control System (VCS) is a category of software tools that helps a software team manage changes to source code over time. It keeps track of every modification to the code in a special kind of database. If an error is introduced, developers can look back and compare earlier versions of the code to fix the mistake while minimizing disruption to the rest of the team. Simply put, it is a system that records changes to a file or set of files over time so that you can recall specific versions later.

2. Differentiate the three models of VCSs, stating their pros and cons
Local version control system
This is the oldest kind of VCS: a local database on the developer's own machine holds the entire history of the project. Because everything lives on one disk, there is a risk of losing the whole history in a single hard-disk failure, and the model cannot be used for collaborative software development.

Centralized version control system
This setup offers many advantages, especially over local VCSs. For example, everyone knows to a certain extent what the other members are doing on the project. Administrators have fine-grained control over who can do what, and it is far easier to administer a centralized VCS than it is to deal with local databases on every client.
On the negative side, if the centralized server goes down for any reason, nobody can collaborate at all or save versioned changes to anything they are working on. And if the hard disk holding the central database becomes corrupted and proper backups haven't been kept, you lose absolutely everything.
Distributed version control system
Here clients don't just check out the latest snapshot of the files; they fully mirror the repository, including its full history. If any server dies, and these systems were collaborating via that server, any of the client repositories can be copied back up to the server to restore it. Every clone is a full backup of all the data. We can also work with several remote repositories, so we can collaborate with different groups of people in different ways within the same project.
This allows you to set up several types of workflows that aren't possible in centralized systems, such as hierarchical models.
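For example, a developer in a distributed system can synchronize with several remote repositories at once. A minimal sketch (the URLs and the second remote name below are hypothetical):

    # Cloning fully mirrors the repository, including its complete history
    git clone https://example.com/project.git
    cd project

    # Add a second remote for a different group of collaborators
    git remote add teammates https://example.com/teammates/project.git

    # Fetch and merge from either remote independently
    git fetch origin
    git fetch teammates
    git merge teammates/master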

3. Git and GitHub, are they same or different? Discuss with facts.

Git is a version control system (VCS) that keeps track of all the changes to your project files, allowing you to work with your team on the same code while avoiding a lot of the confusion that tends to happen when multiple people are editing the same files. Git stores a record of every change that you or someone else makes to the project, which makes keeping track of your progress easy. If you need to look at your old code, or see what was modified and by whom, the record is always there.
GitHub acts as remote storage for your Git repository & provides a really neat & simple way for people to collaborate on & contribute to development projects. If you are looking for ways to contribute to open source, GitHub is the place to go. If you are new to development, you can use GitHub as a learning tool by analyzing popular code repositories.
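For example, grabbing a full copy of a popular open-source repository from GitHub takes a single command (shown here with Git's own source tree, which is mirrored on GitHub):

    git clone https://github.com/git/git.git
    cd git
    git log --oneline -5   # browse the most recent commits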

4. Compare and contrast the Git commands, commit and push?

Git commit records changes to the repository; it works against your local repository.
Git push updates remote refs along with their associated objects; it is used to interact with a remote repository.
Since Git is a distributed version control system, the difference is that commit records your changes in the local repository, whereas push sends those changes up to a remote repository.
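A minimal sketch of the difference (the file name is hypothetical; origin and master are common defaults for the remote and branch names):

    # commit records the change in the LOCAL repository only
    git add report.txt
    git commit -m "Update report"

    # Nothing has left your machine yet; push publishes the commit to the REMOTE
    git push origin master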

5. Discuss the use of staging area and Git directory

Staging Area
A staging area (in data warehousing) is a place where you hold temporary tables on the data warehouse server. Staging tables are connected to the work area or fact tables. We basically need a staging area to hold the data and perform data cleansing and merging before loading the data into the warehouse.
A staging area is like a large table with data separated from its sources, to be loaded into a data warehouse in the required format. If we attempt to load data directly from OLTP, it might mess up the OLTP system because of format differences between the warehouse and OLTP. Keeping the OLTP data intact is very important for both the OLTP system and the warehouse.
Depending on the complexity of the business rules, we may require a staging area; its basic purpose is to clean the OLTP source data and gather it in one place. Basically it is a temporary database area. Staging-area data is used for further processing, after which it can be deleted.

A staging area is a temporary schema used to:
• Do flat mapping
Ex: dumping all the OLTP data into it without applying any business rules. Pushing data into staging takes less time because no business rules or transformations are applied to it.
• Perform data cleansing & validations as a first layer
In the absence of a staging area, the data load would have to go from the OLTP system to the OLAP system directly, which would severely hamper the performance of the OLTP system. This is the primary reason for the existence of a staging area. In addition, it also offers a platform for carrying out data cleansing.

Git Directory
The purpose of Git is to manage a project, or a set of files, as they change over time. Git stores this information in a data structure called a repository. A Git repository contains, among other things:
1. A set of commit objects
2. A set of references to commit objects, called heads
The Git repository is stored in the same directory as the project itself, in a subdirectory called .git.
There is only one .git directory, in the root directory of the project. The repository is stored in files alongside the project; there is no central server repository.
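In Git itself, the staging area (also called the index) is held inside this .git directory and collects the changes that will go into the next commit. A minimal sketch (the file name is hypothetical):

    # The .git subdirectory in the project root holds the entire repository
    ls .git                     # HEAD, index, objects/, refs/, ...

    # Stage a change into the index, then write it to the repository
    git add notes.txt           # moves the change into the staging area
    git status                  # shows notes.txt under "Changes to be committed"
    git commit -m "Add notes"   # records the staged snapshot as a commit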


6. Explain the collaboration workflow of Git, with example
GitHub flow is a lightweight, branch-based workflow that supports teams and projects where deployments are made regularly. Here is how it flows.
CREATE A BRANCH
When you're working on a project, you're going to have a bunch of different features or ideas in progress at any given time – some of which are ready to go, and others which are not. Branching exists to help you manage this workflow.
When you create a branch in your project, you're creating an environment where you can try out new ideas. Changes you make on a branch don't affect the master branch, so you're free to experiment and commit changes, safe in the knowledge that your branch won't be merged until it's ready to be reviewed by someone you're collaborating with.

PROTIP

Branching is a core concept in Git, and the entire GitHub flow is based upon it. There's only one rule: anything in the master branch is always deployable.
Because of this, it's extremely important that your new branch is created off of master when working on a feature or a fix. Your branch name should be descriptive so that others can see what is being worked on.
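A minimal command-line sketch of this flow (the branch name is hypothetical):

    # Create a descriptive branch off of master
    git checkout master
    git pull origin master
    git checkout -b fix-login-timeout

    # Work and commit on the branch; master is untouched
    git commit -am "Increase login session timeout"

    # Publish the branch so collaborators can review it (e.g. via a pull request)
    git push origin fix-login-timeout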

7. Discuss the benefits of CDNs

A CDN is an interconnected system of computers on the internet that provides web content rapidly to numerous users by duplicating or caching the content on multiple servers & directing the content to users based on proximity. The goal of a CDN is to serve content to end users with high availability & high performance.
CDNs serve a large fraction of internet content today, including web objects, downloadable objects, applications, real-time streaming data, on-demand streaming media & social networks.
When an end user requests a specific web page, video or file, the server closest to that user is dynamically determined & used to deliver the content, thereby increasing the speed of delivery. Content may be replicated on hundreds or thousands of servers in order to provide identical content to as many users as possible, even during peak times.



Advantages of CDN

Companies that see huge traffic on their websites on a daily basis can use a CDN to their advantage. When a large number of users simultaneously access a web page with some specific content, such as a video, a CDN enables that content to be sent to each of them without delay. Here are a few of the benefits of using a CDN for your website:

1. Your Server Load will decrease:
As a result of the strategically placed servers that form the backbone of the network, companies can see an increase in capacity and in the number of concurrent users they can handle. Essentially, the content is spread out across several servers, as opposed to offloading it all onto one large server.

2. Content Delivery will become faster:
Due to higher reliability, operators can deliver high-quality content with a high level of service, low network and server loads, and thus lower costs. Moreover, jQuery is ubiquitous on the web, so there is a high probability that someone visiting a particular page has already downloaded it from the Google CDN in the past. In that case the file has already been cached by the browser and the user won't need to download it again (a quick way to observe this caching is sketched after this list).

3. Segmenting your audience becomes easy:
CDNs can deliver different content to different users depending on the kind of device requesting the content. They are capable of detecting the type of mobile device and can deliver a device-specific version of the content.

4. Lower Network Latency and packet loss:
End users experience less jitter and improved stream quality. CDN users can, therefore, deliver high definition content with high Quality of Service, low costs, and low network load.

5. Higher Availability and better usage analytics:
CDNs dynamically distribute assets to the strategically placed core, fallback, and edge servers. CDNs can give more control of asset delivery and network load. They can optimize capacity per customer, provide views of real-time load and statistics, reveal which assets are popular, show active regions and report exact viewing details to customers. CDNs can thus offer very high availability, even during large power, network or hardware outages.

6. Storage and Security:
CDNs offer secure storage capacity for content such as videos for enterprises that need it, as well as archiving and enhanced data backup services. CDNs can secure content through Digital Rights Management and limit access through user authentication.
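As a quick way to observe the caching benefit from point 2 above, you can inspect the cache headers a public CDN attaches to a widely shared file, such as jQuery on the Google CDN (the library version shown is just an example):

    curl -sI https://ajax.googleapis.com/ajax/libs/jquery/3.6.0/jquery.min.js \
      | grep -i -E 'cache-control|expires|age'

A long max-age in the Cache-Control header is what lets the browser reuse the file instead of downloading it again.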

8. How CDNs differ from web hosting servers?

The difference between CDNs and web hosting is that web hosting is used to host your website on a server and let users access it over the internet, while a content delivery network is about speeding up the access and delivery of your website's assets to those users. Web hosting normally refers to one server; a CDN refers to a global network of edge servers which distributes your content from a multi-host environment.

Traditional web hosting delivers 100% of your content from one place; a user located across the world must still wait for the data to be retrieved from wherever your web server sits. A CDN takes the majority of your static and dynamic content and serves it from locations across the globe, decreasing download times. In most cases, the closer the CDN server is to the visitor, the faster the assets will load.


10. Discuss the requirements for virtualization

1. Different platforms — software must run consistently across different operating systems and hardware.
2. Missing dependencies — everything the application needs must travel with its environment.
3. Wrong configurations — environment settings must not drift from machine to machine.
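All three problems can be addressed by packaging the platform, dependencies, and configuration into one image. A minimal container-based sketch (the application files named here are hypothetical):

    # One image fixes the OS, the dependencies, and the configuration together
    cat > Dockerfile <<'EOF'
    FROM python:3.11-slim
    COPY requirements.txt .
    RUN pip install -r requirements.txt
    COPY app.py .
    CMD ["python", "app.py"]
    EOF
    docker build -t myapp .
    docker run --rm myapp   # behaves the same on any host that runs Docker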

11. Discuss and compare the pros and cons of different virtualization techniques in different levels
Advantages of Virtualization

1. Using Virtualization for Efficient Hardware Utilization
Virtualization decreases costs by reducing the need for physical hardware systems. Consolidating onto virtual machines uses hardware more efficiently, which lowers the quantity of hardware, the associated maintenance costs, and the power and cooling demand. You can allocate memory, space and CPU in just a second, making you less dependent on hardware vendors.
2. Using Virtualization to Increase Availability
Virtualization platforms offer a number of advanced features that are not found on physical servers, which increase uptime and availability. Although the vendor feature names may be different, they usually offer capabilities such as live migration, storage migration, fault tolerance, high availability and distributed resource scheduling. These technologies keep virtual machines chugging along or give them the ability to recover from unplanned outages.
3.Disaster Recovery
Disaster recovery is much easier when your servers are virtualized. With up-to-date snapshots of your virtual machines, you can quickly get back up and running. An organization can more easily create an affordable replication site. If a disaster strikes the data center or server room itself, you can move those virtual machines elsewhere, such as to a cloud provider. Having that level of flexibility means your disaster recovery plan will be easier to enact and far more likely to succeed.

4.Save Energy
Moving physical servers to virtual machines and consolidating them onto far fewer physical servers means lowering monthly power and cooling costs in the data center. It reduces the carbon footprint and helps to clean up the air we breathe. Consumers want to see companies reducing their output of pollution and taking responsibility.
5. Deploying Servers Fast
You can quickly clone an image, master template or existing virtual machine to get a server up and running within minutes. You do not have to fill out purchase orders, wait for shipping and receiving, and then rack, stack and cable a physical machine, only to spend additional hours waiting for the operating system and applications to complete their installations. With virtual backup tools like Veeam, redeploying images is so fast that your end users will hardly notice there was an issue.
6. Save Space in your Server Room or Datacenter
Imagine a simple example: you have two racks with 30 physical servers and 4 switches. Virtualizing those servers can cut the space used by the physical servers in half. The result can be two physical servers in a rack with one switch, where each physical server holds 15 virtualized servers.
7. Testing and setting up a Lab Environment
While you are testing or installing something on your servers and it crashes, do not panic: there is no data loss. Just revert to a previous snapshot and you can move forward as if the mistake never happened (see the snapshot sketch after this list). You can also isolate these testing environments from end users while still keeping them online. When your work is complete, deploy it to production.


8. Possibility to Divide Services
If you have a single server holding different applications, the services are more likely to conflict with each other, increasing the failure rate of the server. If you virtualize this server, you can put the applications in environments separated from each other, as discussed previously.
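The snapshot-and-revert pattern from point 7 above looks like this with VirtualBox's command-line tool, as a minimal sketch (the VM and snapshot names are hypothetical):

    # Take a snapshot before making risky changes
    VBoxManage snapshot "test-vm" take "before-upgrade"

    # ...experiment; if something breaks, power off and roll back
    VBoxManage controlvm "test-vm" poweroff
    VBoxManage snapshot "test-vm" restore "before-upgrade"
    VBoxManage startvm "test-vm" --type headless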
Disadvantages of Virtualization
1. Extra Costs
You may have to invest in the virtualization software, and additional hardware might be required to make the virtualization possible. This depends on your existing network. Many businesses have sufficient capacity to accommodate virtualization without requiring much cash, but if your infrastructure is more than five years old, you have to consider an initial renewal budget.
2.   Software Licensing
This is becoming less of a problem as more software vendors adapt to the increased adoption of virtualization. However, it is important to check with your vendors to understand how they view software use in a virtualized environment.
3.   Learn the new Infrastructure
Implementing and managing a virtualized environment will require IT staff with expertise in virtualization. On the user side, a typical virtual environment will operate similarly to the non-virtual environment. There are some applications that do not adapt well to the virtualized environment.
Desktop virtualization
Desktop virtualization is an increasingly important technology for many organizations. A virtual desktop means that a user’s desktop environment is stored remotely on a server, rather than on a local PC or other client computing device.
1.   Cost Savings
From an IT perspective, virtual desktops help reduce the time it takes to provision new desktops, and they also help decrease desktop management and support costs. Experts estimate that maintaining and managing PC hardware and software accounts for 50-70% of the total cost of ownership (TCO) of a typical PC. Companies often turn to virtual desktops to cut these IT labor costs.

2.   Simplified management
Since everything is centrally managed, stored and secured, virtual desktops eliminate the need to install, update and patch applications, back up files and scan for viruses on individual client devices. Desktop virtualization also helps to streamline management of software assets.
3.   Enhanced Security
Virtual desktops provide greater security to an organization because employees aren't carrying around confidential company data on a personal device that could easily be lost, stolen or tampered with.

4.   Increased productivity
Virtual desktops allow employees to access applications and documents from multiple devices, including other desktop computers, laptops and more. This increases productivity by allowing workers to access necessary data from anywhere.

12. Identify popular implementations and available tools for each level of virtualization

Desktop virtualization generally consists of 5 modules: (1) the virtual machines on which the desktop instances run; (2) an image repository, which holds the image data; (3) a management server, which manages the whole infrastructure and its users; (4) a management console, provided by the management server, through which end users who want a virtualized remote desktop can log in and request the specific OS desktop assigned to them; and (5) the end users themselves, which covers not only the person and the end node that accesses the virtualized desktop but also the network connecting that node to the various servers and the image store.
Tools: VirtualBox, VMware Server, Parallels, Xen
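With VirtualBox, for instance, a new virtual machine can be provisioned from the command line in seconds. A minimal sketch (the name and sizes are illustrative):

    VBoxManage createvm --name "desktop-01" --ostype Ubuntu_64 --register
    VBoxManage modifyvm "desktop-01" --memory 2048 --cpus 2
    VBoxManage startvm "desktop-01" --type headless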


13. What is a hypervisor and what is its role?

A hypervisor, or virtual machine monitor (VMM), is computer software, firmware or hardware that creates & runs virtual machines. The hypervisor presents the guest operating system with a virtual operating platform & manages the execution of the guest operating system. Its role is to:
• Provide an environment identical to the physical environment.
• Provide that environment with minimal performance cost.
• Retain complete control of the system resources.
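On Linux you can check whether the CPU exposes the hardware virtualization extensions that modern hypervisors rely on (Intel VT-x shows up as vmx, AMD-V as svm):

    # A non-zero count means the CPU supports hardware-assisted virtualization
    grep -c -E 'vmx|svm' /proc/cpuinfo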

14. How is emulation different from VMs?

Virtual machines make use of CPU self-virtualization, to whatever extent it exists, to provide a virtualized interface to the real hardware. Emulators emulate hardware without relying on the CPU being able to run guest code directly, and they redirect some operations to a hypervisor controlling the virtual container.
Both aim for some level of independence from the hardware of the host machine, but a virtual machine tends to simulate just enough hardware to make the guest work, with an emphasis on the efficiency of the virtualization/emulation. Ultimately the VM may not act like any hardware that really exists and may need VM-specific drivers, but the set of guest drivers will be consistent across a large number of virtual environments.
An emulator, on the other hand, tries to exactly reproduce all the behavior, including the quirks and bugs, of some real hardware being simulated.
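QEMU illustrates the difference neatly: the same tool can either emulate a CPU entirely in software or, with KVM enabled, hand guest code to the real CPU's self-virtualization (the disk image name is hypothetical):

    # Pure emulation: every guest instruction is translated in software (slow)
    qemu-system-x86_64 -m 2048 guest-disk.img

    # Hardware-assisted virtualization: guest code runs directly on the CPU via KVM (fast)
    qemu-system-x86_64 -enable-kvm -m 2048 guest-disk.img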






15. Compare and contrast VMs and containers/Docker, indicating their advantages and disadvantages
  
VMs                                     | Containers
1. Heavyweight                          | 1. Lightweight
2. Limited performance                  | 2. Native performance
3. Each VM runs in its own OS           | 3. All containers share the host OS
4. Hardware-level virtualization        | 4. OS-level virtualization
5. Startup time in minutes              | 5. Startup time in milliseconds
6. Allocates the required memory        | 6. Requires less memory space
7. Fully isolated & hence more secure   | 7. Process-level isolation, possibly less secure
Both have benefits as well as drawbacks, as the table above shows.
VMs are the better choice for running apps that require all of the OS's resources and functionality, when you need to run multiple applications or servers, or when you have a wide variety of operating systems to manage.
Containers are the better choice when your biggest priority is maximizing the number of applications running on a minimal number of servers.
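The startup-time contrast in the table is easy to see in practice (the VM name below is hypothetical):

    # A container shares the host kernel and starts in milliseconds
    docker run --rm alpine echo "hello from a container"

    # A VM must boot a full guest operating system, which takes minutes
    VBoxManage startvm "ubuntu-vm" --type headless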



