
Starfish Zones Data Visualization Tool


Overview:

Starfish Zones is a self-service visual interface that allows groups to view folder storage amounts and locations. Users can navigate through the folder structures in the dashboard to explore directory and file level details, including storage amounts, last accessed and modified times, file owners, and file counts. The tool is still under development and may experience short downtimes to accommodate modifications.

Login:

To access the dashboard, navigate to https://starfish.rc.fas.harvard.edu/. You will need to be on Harvard VPN, FASRC VPN, or a wired on-campus connection. See VPN Setup for how to connect to FASRC VPN.

After navigating to the website, input your FASRC account name and password. If you have issues with your FASRC password, please visit the FASRC website.

You will need to be a member of the folder’s user group to gain access. If you are not a member of a FASRC user group, please email FASRC Help to be added. If you are a new faculty or group owner, it may take some time before your information is fully populated in FASRC systems. You will also need an active FASRC supported storage folder to receive a Starfish Zone.

Navigation:

Once logged in, you will be able to view all storage folders associated with your group. Note: If you notice a storage folder is missing from your dashboard, please email FASRC’s Research Data Manager and it will be added to your view.

 

By double-clicking on the selected folder path, you can drill down to the file level. Users can change which information is displayed on the dashboard by right-clicking on the column headers; all available column selections will be shown.

 

The dashboard is updated on a consistent basis. You can view when your Zone was last updated on the upper right-hand corner of the dashboard, where a date and time will be listed. If no modifications have been made to the folder contents, the updated time will reflect the last time changes were made to the folder.

Export:

The dashboard allows users to export the information as a CSV file. At the top of the zone is a “Download CSV” option. Users can select which columns they would like included in the downloaded spreadsheet. Some of our suggested columns include:

  • Count (number of files)
  • Path (folder path)
  • Logical size (dataset size)
  • Newest accessed (tree)
  • Newest modified (tree)
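
Once downloaded, the CSV can be summarized with a few lines of Python. The following is a minimal sketch, not an official FASRC script. It assumes the export has header columns named "Path", "Count", and "Logical size", and that sizes are written as plain byte counts. Both are assumptions, so check the header row and units of your own file first. The sample data is made up.

```python
import csv
import io

# Assumed column headers; adjust to match the header row of your export.
PATH_COL = "Path"
SIZE_COL = "Logical size"
COUNT_COL = "Count"


def largest_folders(csv_text, top_n=3):
    """Return the top_n (path, size_bytes, file_count) rows by logical size."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumes sizes and counts are plain integers (sizes in bytes).
        rows.append((row[PATH_COL], int(row[SIZE_COL]), int(row[COUNT_COL])))
    rows.sort(key=lambda r: r[1], reverse=True)
    return rows[:top_n]


# Made-up sample standing in for the contents of a downloaded file.
sample = """Path,Count,Logical size
/lab/projectA,1200,5000000000
/lab/projectB,30,750000000000
/lab/scratch,98000,120000000000
"""

top = largest_folders(sample, top_n=2)
for path, size, count in top:
    print(f"{path}: {size / 1e12:.2f} TB in {count} files")
```

To run this against a real export, read the file with `open(filename, newline="").read()` and pass the text to `largest_folders`.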

Contact:

If you have any additional questions about how to log in to or use the Starfish Zones dashboard, please email Sarah Marchese, FASRC Research Data Manager.

Coldfront – Allocation management


Please check the Storage Service Center page for the dataflow to ColdFront

ColdFront is an open-source resource allocation management system designed to provide a central portal for administration, usage reporting, and allocation management of HPC resources.  FASRC adapted the open-source software to manage allocations on the FASRC cluster.  The platform enables the viewing and management of both lab groups (Projects) and their storage or cluster allocations (Allocations).

Accessing Coldfront

To access Coldfront, connect to the @fasrc VPN and log in using FASRC credentials.

https://coldfront.rc.fas.harvard.edu/

 

After logging in, you will see your home page, which has sections for your projects, your allocations, and any pending requests or change requests for your allocations. Click the project link or Allocation active button to view details about your projects/allocation.

 

Project Pages:

A Project page allows all project members to:

  • view the project’s allocations
  • view the project’s users
  • adjust their project notification settings.

Additionally, from the project page, project managers and PIs can:

  • request new storage allocations
  • request changes to existing storage allocations
  • add users to the project
  • remove or request to remove users from the project
  • edit the roles of project users (i.e., assign or remove Manager status)
  • send an email to project users that have elected to receive notifications.

The project page’s allocation section allows you to view storage and cluster allocations for your lab. Managers can also click the “Request New Storage Allocation” button on this table’s header to… yes, that’s right, submit a request for a new storage allocation.

The project page’s users table lists all the users in the lab group. The users table contains options for PIs and managers to add users to the project, remove or request to remove users from the project, edit the roles of project users (i.e., assign or remove Manager status), or send an email to project users that have elected to receive notifications.

 

Allocations:

The Allocation Page provides a comprehensive view of details about the allocation, presenting key information such as the total allocation size, overall usage, and the estimated monthly cost. This page also features a table that illustrates usage per user, with data sourced and updated daily from our data management system, Starfish.

 

Making an Allocation Request:

PIs and users with project manager status can make a new allocation request or request changes to the current allocation.

To request a new allocation:

  1. Go to the page of the project the new allocation is for and click the Request New Storage Allocation button.
  2. Fill out and submit the allocation request form.
  3. You will be notified via email when your new allocation is ready to use.

Allocations can be requested on storage tiers 0-3. To explore and understand the specifics of each storage tier, please refer to our detailed documentation on storage tiers here.

Making an Allocation Change Request:

PIs and project managers can request to change the size of an allocation associated with their project. Follow these steps to initiate the process:

  1. Navigate to the allocation page corresponding to the allocation you wish to modify.
  2. Click the “Request Change” button at the top of the “Allocation Information” table.
  3. In the resulting form, enter the desired size of the allocation and the justification for the change, then click Submit.
  4. You will be notified via email when the allocation is updated and ready to use.

 

Please review https://www.rc.fas.harvard.edu/services/data-storage/#Offerings_Tiers_of_Service for the storage features and updates. If you have any questions, please feel free to reach us at rchelp@rc.fas.harvard.edu

Storage Service Center


This page provides the information you need to request and manage data storage allocations and to handle billing for them.  It is essential that you review the Storage Billing FAQ and Data Storage Service pages.  Below we provide information on the three different software applications we use to help PIs (Coldfront), Finance managers (FIINE), and Lab/Data managers (Starfish) perform their roles.

To get more information about the storage tiers and their features please visit storage tiers on our data-storage services page. Please feel free to reach out to us at  rchelp@rc.fas.harvard.edu with any questions.

STORAGE TIERS and COST

Feature | Tier 0 | Tier 1 | Tier 2 | Tier 3
Description | High-performance Lustre | Enterprise Isilon | NFS Storage | Tape
Cost per TB/month (rounded down) | $4.16 ($50/yr) | $20.83 ($250/yr) | $8.33 ($100/yr) | $0.416 ($5/yr)
Snapshot ** | No | Yes | No | No
Disaster Recovery * | No | Yes | Yes | No
Available for new allocations | Yes | Yes | Yes | Yes
Maximum size per share | N/A | N/A | 306 TB | 20 TB per tape (19.48 TB usable)

* Disaster Recovery (aka DR) means that the entire share can be restored in the event of hardware failure or other ‘disaster’. This DR copy is not accessible to the end user and is not suitable for recovering individual files that have been accidentally deleted.

** Snapshot means that a snapshot of the filesystem is periodically taken (for up to 7 days) and can be accessed by the end user to recover individual files that are accidentally deleted.
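
The "rounded down" monthly figures in the table follow from dividing the annual rate by 12 and truncating. The sketch below reproduces that arithmetic. The function names are illustrative only, and the assumption that a bill equals the truncated monthly rate times the allocated TB is inferred from the examples in the FAQ, not from the actual billing system.

```python
from decimal import Decimal, ROUND_DOWN

# Annual rates in dollars per TB, taken from the table above.
ANNUAL_RATE_PER_TB = {
    "Tier 0": Decimal("50"),
    "Tier 1": Decimal("250"),
    "Tier 2": Decimal("100"),
    "Tier 3": Decimal("5"),
}


def monthly_rate(annual, places="0.01"):
    """Annual rate divided by 12, truncated (not rounded) to `places`."""
    return (annual / 12).quantize(Decimal(places), rounding=ROUND_DOWN)


def monthly_cost(tier, allocated_tb):
    """Assumed bill: truncated monthly rate times allocated TB."""
    return monthly_rate(ANNUAL_RATE_PER_TB[tier]) * allocated_tb


tier1_rate = monthly_rate(ANNUAL_RATE_PER_TB["Tier 1"])
# The table quotes Tier 3 to three decimal places, so truncate there instead.
tier3_rate = monthly_rate(ANNUAL_RATE_PER_TB["Tier 3"], places="0.001")
ten_tb_tier1 = monthly_cost("Tier 1", 10)

print(tier1_rate, tier3_rate, ten_tb_tier1)
```

`Decimal` is used instead of floats so that truncation to cents behaves exactly as written.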

Important notes:

  • If you require more than approximately 100TB in a single allocation, please contact us first to discuss or drop by Office Hours
  • Billing is through the FIINE system. See below to get access to FIINE
  • Billing is done monthly
  • The cutoff day for billing is the 15th of each month. Any changes done to the allocation will be reflected in next month’s bill
  • The service center needs a 33-digit billing code to provide the service. It’s an internal service, so we can’t create POs for billing
  • To learn how to read your current bill, see How to read your Storage Service Center bill
  • For billing questions and queries email billing@rc.fas.harvard.edu

REQUEST OR MANAGE AN ALLOCATION

NEW OR EXISTING TIERED ALLOCATIONS

  • To request an allocation or manage an existing one, the PI (or a previously designated storage Manager) should log into Coldfront
  • If you cannot access Coldfront, or you are a PI who would like to designate a storage Manager for your lab, please contact FASRC

Lead Time for New Tape (Tier 3) Allocations

There is a minimum setup time of roughly 2 weeks for new tape allocations. This timeframe assumes we receive the completed tape setup from our service partner NESE without delay. Delays there are beyond our control and could increase lead time. Please note that any storage changes made after the 15th of the month will be reflected in the following month’s billing.

MANAGE BILLING FOR ALLOCATIONS

Charges for storage allocations are billed monthly. Expense code(s) can be applied to each allocation and can be sub-divided among multiple billing codes.
See our Service Center FAQ for answers to common questions.

See also How to read your Storage Service Center bill

To manage billing for an existing allocation, you will need:

Instructions for expense code management and billing record review in FIINE are available at:
https://ifx.rc.fas.harvard.edu/docs/user/fiine.html

 

Starfish – Data Management

Starfish – Scans the different storage servers to provide a view, usage details, metadata, and tagging based on the projects. Check here for more details about Starfish and examples of how to query the data.

See also our guide to Data Management Best Practices

Coldfront – Lab and Allocation management:

Coldfront – Provides a view of PI projects and allocations. New allocations and updates to existing allocations can be requested using Coldfront. Check here for more details about Coldfront and its use.

Fiine – FAS Instrument Invoicing Environment

Fiine – For lab/finance administrators to manage the expense codes per project/user and view invoices. Check here for more information about using the Fiine system.

 

FAQ – Storage Service Center

Since the growth of storage has increased tenfold in the past 5 years, hosting individual small-capacity storage server deployments has become unsustainable to manage. These individual server systems do not easily allow data shares to grow. Due to their small volume, many systems are run above 85% utilization, which degrades performance.

Many systems also run beyond their original maintenance contract, which causes issues in sourcing parts to make repairs; older systems (>5yr) increase the risk of catastrophic data loss. Some systems were purchased by PIs without a provision for backup systems, which has led to confusion about which data shares should have backups. Our prior backup methodology does not scale to these larger systems with hundreds of millions of files. Given these historical reasons, revamping our storage service offerings allows FASRC to maintain the lifecycle of equipment, allowing us to project the overall growth for data capacity, datacenter space, and professional staffing to maintain your research data assets safely.

Prior to the establishment of a Storage Service Center, we only offered a single NFS filesystem for your Lab Share; you now have the choice of four storage offerings to meet your technology needs. The tiers of service clearly define what type of backup your data will have. You only have to pay for an allocation capacity that you need, as opposed to having to guess at the beginning of a server purchase and have this excess go unused.

Over time, you can request an increase to your allocation size. You will receive monthly reports on utilization from each tier to help you plan for future data needs. Some of our tiers will also have web-based data management tools that allow you to query different aspects of your data, tag your data, and see visual representations of your data.

Unlike the compute cluster, where resources are reserved and released, data is allocated to storage long-term. In addition, storage needs across various research domains are drastically different. Therefore, in the FY19 federal rate setting, FAS decided to remove the portion of FASRC dedicated to maintaining storage out of the facilities part of the F&A. This allows FAS to run a Storage Service Center with costs that are allowable on federal awards.
Information about the storage offerings can be found on our Storage Services page and Storage Service Center document. Requests for storage allocations can be made through Coldfront (FASRC VPN required). Please keep in mind that large requests (>100 TB) might not all be available at the time of request and a smaller increase will be applied as we add more capacity in the coming month.
Yes, you can have allocations in different storage tiers to meet your needs and budget.

We have worked with RAS on two allocation methods to charge data storage to your grants: (1) the per user allocation method and (2) the per project allocation method.

Per user allocation method: You will be supplied a usage report by user for each tier. You can use the % of data associated with each individual to determine their share of the cost, then distribute that share across grants in the same proportions as the individual's % effort.

Example 1: A PI has a 10 TB allocation on Tier 1 that researchers John and Jill use. The monthly bill for 10 TB of Tier 1 is $208.30 (at $20.83/TB/mo). The usage report shows 8 TB of total usage, of which John's usage is 60% and Jill's is 40%. So the data charges associated with John are $124.98 and those associated with Jill are $83.32. John is funded 50% on an NSF project and 50% on an NIH project, thus $62.49 should be allocated to each grant. Jill is funded 100% on an NSF project, thus $83.32 should be allocated to her NSF grant.
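
The arithmetic in Example 1 can be checked with a short script. This is an illustrative sketch only: the function name and data layout are made up for the example, amounts are held in integer cents to avoid floating-point drift, and any leftover fraction of a cent is simply dropped (the figures in Example 1 divide evenly, so nothing is lost here).

```python
def split_by_share(total_cents, shares):
    """Split an amount in cents by integer percentage shares summing to 100."""
    if sum(shares.values()) != 100:
        raise ValueError("shares must sum to 100 percent")
    # Floor division: fractions of a cent are dropped, not redistributed.
    return {name: total_cents * pct // 100 for name, pct in shares.items()}


monthly_bill_cents = 20830  # 10 TB of Tier 1 at $20.83/TB/mo

# Step 1: split the bill by each user's share of the data.
per_user = split_by_share(monthly_bill_cents, {"John": 60, "Jill": 40})

# Step 2: split each user's charge by their % effort on each grant.
john_grants = split_by_share(per_user["John"], {"NSF": 50, "NIH": 50})
jill_grants = split_by_share(per_user["Jill"], {"NSF": 100})

for label, cents in [("John", per_user["John"]), ("Jill", per_user["Jill"])]:
    print(f"{label}: ${cents / 100:.2f}")
print("John per grant:", {g: f"${c / 100:.2f}" for g, c in john_grants.items()})
```

Note that the split is applied to the billed allocation (10 TB), not to the 8 TB actually in use; the usage report only determines the percentages.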

This method allows faculty to manage their data structures independently of specific projects, since multiple projects may use some of the same data. Keep in mind that as researchers leave, there needs to be a plan for their data, as this data will continue to be reported on in the usage reports.

Per project allocation method: If you request a project-specific report, you will have a direct mapping of the data used by this project and can allocate its full cost according to the cost distribution of the project's grants.

Example 2: A PI requests a new 5 TB allocation on Tier 1 for an NSF-funded project. 10 users share this data. The monthly bill would include a Tier 1 charge of $104.15 (5 TB at $20.83/TB/mo). The entire $104.15 would be charged to the NSF grant.

This allows there to be a very straightforward assignment between data and funding source. Reuse of the active parts of this data will need to be assigned to future projects.

Example 3: The above PI also has 100 TB allocation on Tier 0 used for multiple projects with multiple funding sources. The usage report for the Tier 0 would be provided per user as per Example 1 above, and the % effort allocation method would be used for Tier 0, while the Example 2 would be used for the new project on Tier 1.

As is common with other Science Operations Core Facilities, once funding sources have been established for bills, we will continue to direct bill those funds until the PI updates these distributions. For the first few months billing will be manual via email until the new Science Operations LIMS billing system is complete.

We suggest that a data management plan is established at the beginning of a project, so that a full data lifecycle can be mapped to phases of your data. This helps identify data that will need to be kept long-term from the start, as well as helps mitigate data being orphaned when students and postdocs move on. If research data is being used again in a subsequent project, you should allocate funds to carry this data forward to new projects. As per federal regulations, you cannot pay for storage in advance. The Tier 3 tape service provides a location to deposit data longer term (7 years), which can meet many of the funding requirements.

Billing will be handled by Science Operations Core Facilities. You will be billed monthly for the TB allocation of space for each tier. Groups will have 2-3 business days to review the invoices before the charges are assessed via internal billing journals. By default, we will also provide you a usage report by user. A usage report per project is available by request and is best set up for new projects with new allocations.
It is your and your finance admin's responsibility to update or verify your 33-digit billing code for monthly billing in the FIINE system. If no other billing codes are designated, your start-up fund will be used. We are here to help you navigate these decisions: Contact FASRC

For billing inquiries or issues, please email billing@rc.fas.harvard.edu

For general storage issues, questions, or tier changes, please contact rchelp@rc.fas.harvard.edu

We have moved away from owned servers. Very few exceptions will be made. If circumstances warrant one, the request will be reviewed by the University Research Computing Officer, Sr. Director of Science Operations and Administrative Dean of Science. One possible exception is when storage must be adjacent to an instrument where data collection rates are beyond the capacity of 1 Gbps Ethernet (100 MB/s) for extended periods (days).

We will maintain existing physical servers while under warranty, which is typically 5-6 years from their purchase date. We will need a data migration plan to the appropriate tiers a few months prior to decommissioning the server.

Over FY22 we will be migrating whole filesystems at a time into the storage service center. All new space requests will be allocated on newly deployed storage in one of the Tiers.

Most owned storage servers have already been phased out.

Information about the storage offerings can be found on our Storage Services page and Storage Service Center document.

Requests for storage allocation changes can be made through Coldfront (FASRC VPN required). Select your project after logging in and you will find a "Request Allocation Change" button beside each allocation listing.

We ask that you plan ahead for future needs rather than repeatedly adding small increments. Please limit your allocation change requests to no more than once every 60 days for a particular allocation. We have a cutoff date of the 15th for billing, so changes requested after that date will not be reflected on your bill until the next billing cycle.
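
The two timing rules above can be sketched as simple date arithmetic. This is a hedged illustration, not FASRC policy code: it assumes a change made on or before the 15th is reflected in that month's billing cycle and a later change rolls to the following month, and it treats "once every 60 days" as a 60-calendar-day gap from the previous request. Both interpretations are assumptions, and the function names are hypothetical.

```python
from datetime import date, timedelta

BILLING_CUTOFF_DAY = 15          # from the text: cutoff is the 15th
MIN_DAYS_BETWEEN_REQUESTS = 60   # from the text: at most once every 60 days


def first_billing_cycle(change_date):
    """Return (year, month) of the first cycle assumed to reflect a change."""
    # Assumption: "after that date" means strictly after the 15th.
    if change_date.day <= BILLING_CUTOFF_DAY:
        return (change_date.year, change_date.month)
    if change_date.month == 12:
        return (change_date.year + 1, 1)
    return (change_date.year, change_date.month + 1)


def earliest_next_request(last_request):
    """Earliest date for another change request on the same allocation."""
    return last_request + timedelta(days=MIN_DAYS_BETWEEN_REQUESTS)


on_cutoff = first_billing_cycle(date(2024, 3, 15))
after_cutoff = first_billing_cycle(date(2024, 12, 16))
next_allowed = earliest_next_request(date(2024, 1, 1))

print(on_cutoff, after_cutoff, next_allowed.isoformat())
```

If you are unsure which cycle a change will land in, billing@rc.fas.harvard.edu is the authoritative source.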

© The President and Fellows of Harvard College
Except where otherwise noted, this content is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.