1
|
- Takeshi Horinouchi (Kyoto Univ → Hokkaido Univ (soon)),
- Seiya Nishizawa (Kyoto Univ → Kobe Univ),
- Chiemi Watanabe (Ochanomizu Univ),
- T. Koshiro, A. Tomobayashi, S. Otsuka,
- Y. Morikawa, Y.-Y. Hayashi, M. Shiotani, and
- GFD Dennou Club (Davis project)
|
2
|
- = Geophysical fluid data navigator
- A software suite for building Web-based databases of geophysical fluid data
- Functionality:
- Search
- Data analysis and visualization
- Documentation of analysis results
- Available:
- http://www.gfd-dennou.org/arch/davis/gfdnavi/
|
3
|
|
4
|
- Limited analysis capability
- → We often end up downloading the data
- Not very suitable for desktop use
- → Services are not available for local data
|
5
|
- Impossible to predefine sufficient functionality (since we are scientists)
- → Programmability is the key
- Programmability in two ways:
- Programmable on web-browser
- Web-service API (program locally)
- → Both are desirable
|
6
|
- To others (scientists / society): reports
- While working: memos / internal documents
- To collaborators: reports / know-how / discussion
|
7
|
|
8
|
- GPhys – a Ruby library to analyze and visualize geophysical fluid data (by Horinouchi et al., since 2003)
- For consolidated access to data in files (NetCDF, GRIB, GrADS, NuSDAS, HDF5-EOS) or in run-time memory – A community infrastructure for data analysis [http://ruby.gfd-dennou.org/] (since 1999)
- Ruby on Rails – Development framework for Web applications (since 2005)
- Made it drastically easier to develop Web applications with an RDB
- Written in/for Ruby → We can use GPhys directly
|
9
|
|
10
|
|
11
|
- Since we wanted a language for daily data analysis
- Easy (fast) to write
- Interactive use → like GrADS
- Python is also fine (but we love Ruby)
|
12
|
|
13
|
|
14
|
- Metadata
- name-value attributes; with a few standard field names
- geospatial- and time-coordinate info
- size, user info etc
- Directory structure (inherit metadata from parent directories)
- Generated by automatic scan (with a command)
- variables: reading attributes through GPhys
- directories: directory name and “Readme”-type texts
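The inheritance of metadata from parent directories can be sketched in plain Ruby. This is only an illustration under assumptions: the directory table, the attribute names, and the helper `inherited_metadata` are hypothetical and are not part of Gfdnavi.

```ruby
# Hypothetical sketch: attributes attached to directories are inherited by
# everything below them; entries closer to the node override inherited ones.
# The paths and attribute names are made up for illustration.
DIR_METADATA = {
  "/"                 => { "institution" => "Example Univ" },
  "/reanalysis"       => { "source" => "reanalysis", "time_coverage" => "1979-2001" },
  "/reanalysis/daily" => { "time_resolution" => "daily" },
}.freeze

# Collect metadata for a path by merging from the root down to the path itself.
def inherited_metadata(path, table = DIR_METADATA)
  parts   = path.split("/").reject(&:empty?)
  merged  = (table["/"] || {}).dup
  current = ""
  parts.each do |part|
    current = "#{current}/#{part}"
    merged.merge!(table[current] || {})   # child values win over parent values
  end
  merged
end

meta = inherited_metadata("/reanalysis/daily/T.nc")
puts meta.inspect
```

A variable file with no attributes of its own still picks up everything defined on its ancestors, which is why a few "Readme"-type entries near the top of a tree can describe many files.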
|
15
|
|
16
|
|
17
|
|
18
|
|
19
|
|
20
|
|
21
|
|
22
|
|
23
|
|
24
|
|
25
|
|
26
|
|
27
|
|
28
|
|
29
|
|
30
|
|
31
|
|
32
|
|
33
|
|
34
|
|
35
|
|
36
|
- → Tomorrow by Seiya Nishizawa
|
37
|
- Under development by C. Watanabe (Ochanomizu Univ)
- To create a peer-to-peer network for cross-search and cross-use among Gfdnavi servers
- Then one can access local data and remote data together
|
38
|
- Novel features of Gfdnavi
- Seamless coverage from desktop use to public data service (by having a custom web server)
- Programmability (in the browser & via web service)
- Documentation of analysis results (dynamically reproducible/extensible) (→ memos / reports / PR / blog for scientific collaboration)
- Good implementation
- Extensibility (by using GPhys)
- Swift development (by using Ruby on Rails)
|
39
|
- Support networking → Create a Web of scientific data & knowledge
- Increase analysis & visualization functionality (many needed)
- Improve remote API accesses (tomorrow’s topic)
|
40
|
|
41
|
|
42
|
|
43
|
|
44
|
- Web development framework in Ruby
- With RDBMS (MySQL, PostgreSQL, SQL Server, SQLite etc)
- Strong prototyping (e.g. Model-View-Controller (MVC) structure)
- Comprehensive library (covering Ajax and Web service)
- Ruby-embedded HTML
- → suitable for use with our Ruby library
- Has a private web server (WEBrick); also runs on Apache, lighttpd etc
- → One can personally run a web server anywhere on an arbitrary port
|
45
|
|
46
|
|
47
|
|
48
|
|
49
|
|
50
|
- Takeshi Horinouchi (RISH, Kyoto Univ.)
|
51
|
- Virtual Array: a Ruby class (written in pure Ruby) that represents array data in GPhys
- A VArray object behaves as an array, but its contents can be on various media: (case 1) simply a multi-dimensional array in memory (NArray), data in a NetCDF file (in this case, a file pointer is stored), or GrADS data; (case 2) a subset of another VArray, or multiple VArrays tiled together.
- Can have attributes, like variables in NetCDF datasets
- In practice, NetCDF data are handled by a subclass, VArrayNetCDF, and so on.
|
52
|
- Subset mappings are always kept direct by composing them, in order to prevent long chains (see the figure below).
- Subset slicing (e.g. va[0..10,3]) is done by subset mapping, not by actual data extraction, unless explicitly specified otherwise. Therefore,
- Computationally efficient
- Suitable for writing into subsets of data in files.
- In other words, actual data cutting is deferred until needed; deferring operations until needed is a policy of the GPhys design
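The idea of deferred slicing with composed mappings can be illustrated with a small plain-Ruby sketch. The class `LazyView` is hypothetical and one-dimensional; it is not the VArray implementation, only a minimal model of the same idea.

```ruby
# Hypothetical sketch (not GPhys code): a 1-D "virtual array" whose subsets
# are index mappings onto the original storage rather than copies.
class LazyView
  attr_reader :source, :indices

  # source:  the underlying storage (a plain Array standing in for a file)
  # indices: positions in the source that this view exposes
  def initialize(source, indices = (0...source.size).to_a)
    @source  = source
    @indices = indices
  end

  # Slicing composes the new mapping with the existing one, so the result
  # always points directly at the source (no chain of views).
  def [](range)
    LazyView.new(@source, @indices[range])
  end

  # Writing goes through the mapping into the original storage.
  def fill(value)
    @indices.each { |i| @source[i] = value }
    self
  end

  # Data are actually read only here.
  def to_a
    @indices.map { |i| @source[i] }
  end
end

storage = (0...10).to_a
view    = LazyView.new(storage)[2..7][1..3]   # subset of a subset

p view.indices   # positions in the original storage
p view.to_a      # values are read only at this point
view.fill(-1)    # writes into the matching part of the storage
p storage
```

Because each slice stores positions in the original storage, a subset of a subset costs no more to resolve than a single subset, and writing into a subset modifies the underlying data in place.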
|
53
|
- Consists of a grid (coordinates) and multi dimensional array data
- Can conduct mathematical operations (a GPhys behaves like a numeric array)
|
54
|
- Variables that have the same names as dimensions hold coordinate values (locations)
- Weak point: this rule can be violated
|
55
|
- 3 cases are prepared
- point sampling
- cell type
- simple sequence (though it’s not physical)
|
56
|
- Data divided into “tiles” can be treated as one consolidated GPhys object. Convenient for handling long time sequences divided by period (such as by year) or outputs from parallel simulations on distributed-memory machines. Tiling is done by VArrayComposite.
- Subsets can be handled (see the figure below)
- May be applicable to parallel simulations in future?
- So far, automatic configuration is available only for NetCDF, by using an Array or Regexp (e.g., /data_x(\d)y(\d).nc/ for data_x0y0.nc, data_x0y1.nc, data_x1y0.nc, data_x1y1.nc)
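How a Regexp with capture groups can determine the tile layout is sketched below in plain Ruby. The helper `tile_layout` is hypothetical and no files are opened; how GPhys itself interprets the pattern is not shown here.

```ruby
# Hypothetical sketch: arrange tile files on a 2-D grid using the two
# capture groups of a Regexp as tile indices. No files are opened.
TILE_PATTERN = /data_x(\d)y(\d)\.nc/   # dot escaped here to match it literally

def tile_layout(filenames, pattern = TILE_PATTERN)
  layout = {}
  filenames.each do |name|
    m = pattern.match(name)
    next unless m                      # ignore files that are not tiles
    ix, iy = m[1].to_i, m[2].to_i
    layout[[ix, iy]] = name
  end
  layout
end

files  = %w[data_x1y0.nc data_x0y1.nc README data_x0y0.nc data_x1y1.nc]
layout = tile_layout(files)

# Build the grid from the largest index found along each direction.
nx   = layout.keys.map(&:first).max + 1
ny   = layout.keys.map(&:last).max + 1
grid = Array.new(nx) { |ix| Array.new(ny) { |iy| layout[[ix, iy]] } }
grid.each { |row| puts row.join("  ") }
```

The order in which files are listed does not matter, since each file's position comes from the digits captured in its name.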
|
57
|
- Iterator to handle data too big to read into memory at once.
- GPhys::IO.each_along_dims_write – the result is also written to a file (since the result of an operation is often big too). Another type of iterator is planned but not yet implemented.
- Example:
- Without the iterator:
- require "numru/gphys"
- include NumRu
- gp = GPhys::IO.open(infile, varname)
- ofile = NetCDF.create(ofilename)
- out = gp.mean(0)   # now, the entire result is in memory
- GPhys::IO.write( ofile, out )
- ofile.close
- With the iterator, taking the last dimension to make a loop:
- gp = GPhys::IO.open(infile, varname)
- ofile = NetCDF.create(ofilename)
- out = GPhys::IO.each_along_dims_write(gp, ofile, -1){ |gp_sub|
- [ gp_sub.mean(0) ]   # written to ofile each time
- }
- ofile.close
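The looping idea can be modelled without GPhys. The sketch below is plain Ruby under assumptions: nested Arrays stand in for a file variable, an Array stands in for the output file, and `each_along_last_dim` and `mean_over_rows` are hypothetical helpers, not GPhys methods.

```ruby
# Hypothetical sketch (plain Ruby, not the GPhys API): process a 2-D dataset
# slab by slab along its last dimension, writing each partial result
# immediately so the full result never has to sit in memory.
def each_along_last_dim(data, out, chunk: 2)
  n_last = data.first.size
  (0...n_last).step(chunk) do |start|
    cols = start...[start + chunk, n_last].min
    slab = data.map { |row| row[cols] }     # stand-in for reading a subset
    out.concat(yield(slab))                 # stand-in for writing to a file
  end
  out
end

# Mean over the first dimension of a slab (rows), one value per column.
def mean_over_rows(slab)
  slab.transpose.map { |col| col.sum.to_f / col.size }
end

data = [[1, 2, 3, 4, 5],
        [3, 4, 5, 6, 7]]          # shape: 2 x 5
out  = each_along_last_dim(data, []) { |slab| mean_over_rows(slab) }
p out
```

Only one slab is held at a time, so memory use is bounded by the chunk size rather than by the size of the whole variable.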
|
58
|
- Handled by NumRu::Units (by E. Toyoda)
- Multiplication, division, etc.: handled as they should be
- Addition, subtraction: the units of the first term are inherited
- e.g., addition of [m] and [km] is done after multiplying the second term by 1000. A warning is issued if the units are incompatible (in that case, no conversion is made).
- Introduced UNumeric, a scalar numeric class with units
- GPhys, VArray, and UNumeric recognize one another (stronger to weaker in this order)
- Example: to multiply a GPhys object u representing winds [m/s] by the Coriolis parameter:
- f = UNumeric[1e-4,"s-1"]
- coriolis_frc = f * u   # then the units will be m.s-2
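The addition rule (units of the first term inherited, second term converted, warning without conversion for incompatible units) can be sketched as follows. `Quantity` and `LENGTH_IN_METERS` are hypothetical and cover only two length units; this is not how NumRu::Units is implemented.

```ruby
# Hypothetical sketch (not NumRu::Units): a tiny quantity type for lengths.
LENGTH_IN_METERS = { "m" => 1.0, "km" => 1000.0 }.freeze

Quantity = Struct.new(:value, :units) do
  # Addition keeps the units of the first term; the second term is converted.
  def +(other)
    if LENGTH_IN_METERS.key?(units) && LENGTH_IN_METERS.key?(other.units)
      factor = LENGTH_IN_METERS[other.units] / LENGTH_IN_METERS[units]
    else
      # Incompatible (here: unknown) units: warn and add without conversion.
      warn "incompatible units: #{units} and #{other.units}"
      factor = 1.0
    end
    Quantity.new(value + other.value * factor, units)
  end

  def to_s
    "#{value} #{units}"
  end
end

a = Quantity.new(500.0, "m")
b = Quantity.new(2.0, "km")
puts a + b   # result carries the units of a
puts b + a   # result carries the units of b
```

Because the first term decides the units, `a + b` and `b + a` represent the same length but are expressed in different units.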
|
59
|
- Data service to remote clients
- gphys-remote: a simple directory service (like anonymous FTP); directories and data (in which GPhys objects can be defined) under a top directory are made accessible to remote hosts.
- gave (GUI): can connect to a gphys-remote server
|