Data flow diagrams. DFD methodology - Educational and scientific activities of Vladimir Viktorovich Anisimov Dfd construction rules

Material from PIE.Wiki

DFD (Data Flow Diagramming) is a modeling standard in which a system is represented as a network of activities linked through the objects that interact with the results of those activities. DFD is applied to modeling the information flows of an organization. The notation models not the sequence of activities but the information (data) flows between activities and the objects that use, store or generate this data.

In accordance with the DFD (Data Flow Diagram) methodology, the system model is defined as a hierarchy of data flow diagrams that describe the processes of transforming information from the moment it enters the system until it is delivered to the end user. The diagrams at the upper levels of the hierarchy, the context diagrams, set the boundaries of the model by defining its environment (external inputs and outputs) and the main processes under consideration. Context diagrams are then detailed by the diagrams at the following levels.

The main elements of data flow diagrams are:

external entities;

processes;

data stores;

data flows.

External Entities

An external entity is a material object that is a source or receiver of information. Customers, suppliers, clients, a warehouse, a bank and others can act as external entities on a DFD diagram. Unfortunately, the DFD methodology has not been formalized as a standard, so data flow diagrams use different conventions. Figure 1 shows the external entity symbols used in the Yourdon and Coad Process Notation and the Gane and Sarson Process Notation.

Defining an object as an external entity indicates that it is located outside the boundaries of the analyzed information system.

Processes

Processes represent the transformation of input data streams into output streams in accordance with a specific algorithm. In real life, the process can be performed by some department of the organization that processes input documents and issues reports, by an individual employee, by a program installed on a computer, by a special logical device, and the like.

Data stores

Data stores represent abstract devices for storing information; data can be placed in them or retrieved from them at any time, regardless of how they are physically implemented.

Data stores are a kind of prototype of an organization's information system database.

The name of the store is written inside its symbol. It must be unique within the model and, from the analyst's point of view, reflect the information content as accurately as possible, for example “Suppliers”, “Customers”, “Invoices”. Data store symbols may also carry serial numbers as additional identifiers.

Data flows

A data flow defines information transmitted through some connection (cable, postal service, courier) from a source to a receiver. On DFD diagrams, data flows are drawn as lines with arrows showing their direction. Each data flow is given a name that reflects its content.

The following types of objects are used in DFD:

· activity - a synonym for work in IDEF0 and IDEF3;

· external entity - an object that is a source or recipient of the information (data) changed or used in a given function;

· arrow (data flow) - denotes a flow of information (data);

· data store - any mechanism or abstraction (for example, a record in a database) in which data is stored.

Data can flow between activities in DFD not only indirectly, through data stores, but also directly, if the data does not first have to be placed in a store. The IDEF0, IDEF3 and DFD notations can be used in sequence for progressively deeper development of the organization's model, the final stage of which can be a detailed description of the organization's business processes and information system.
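The object types listed above can be captured in a small data model. The following is only an illustrative sketch: the class names `Kind`, `Flow` and `Diagram`, and the sample order-handling elements, are hypothetical and are not part of any DFD tool.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    PROCESS = "process"                  # activity that transforms data
    EXTERNAL_ENTITY = "external entity"  # source or receiver outside the system
    DATA_STORE = "data store"            # data at rest


@dataclass(frozen=True)
class Flow:
    name: str    # reflects the content of the flow
    source: str  # element the data leaves
    target: str  # element the data enters


@dataclass
class Diagram:
    elements: dict[str, Kind] = field(default_factory=dict)
    flows: list[Flow] = field(default_factory=list)

    def add(self, name: str, kind: Kind) -> None:
        self.elements[name] = kind

    def connect(self, name: str, source: str, target: str) -> None:
        # Every arrow needs a known source and a known target.
        for end in (source, target):
            if end not in self.elements:
                raise KeyError(f"unknown element: {end}")
        self.flows.append(Flow(name, source, target))

    def inputs(self, element: str) -> list[str]:
        return [f.name for f in self.flows if f.target == element]

    def outputs(self, element: str) -> list[str]:
        return [f.name for f in self.flows if f.source == element]


dfd = Diagram()
dfd.add("Customer", Kind.EXTERNAL_ENTITY)
dfd.add("Accept order", Kind.PROCESS)
dfd.add("Issue invoice", Kind.PROCESS)
dfd.add("Orders", Kind.DATA_STORE)
dfd.connect("order", "Customer", "Accept order")
dfd.connect("accepted order", "Accept order", "Orders")        # through a store
dfd.connect("order details", "Accept order", "Issue invoice")  # directly
```

The last two calls illustrate the two ways data can pass between activities: through a data store or directly.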

Building a hierarchy of data flow diagrams

The first step in constructing a DFD hierarchy is to construct context diagrams. Typically, when designing a relatively simple information system (IS), a single context diagram with a star topology is built: at its center is the so-called main process, connected to the receivers and sources of information through which users and other external systems interact with the system.

If for a complex system we limit ourselves to a single context diagram, then it will contain too many sources and receivers of information that are difficult to arrange on a sheet of normal-sized paper, and in addition, the single main process does not reveal the structure of the distributed system. Signs of complexity (in terms of context) can be: the presence of a large number of external entities (ten or more); distributed nature of the system; multifunctionality of the system with an already established or identified grouping of functions into separate subsystems.

For complex IS, a hierarchy of context diagrams is built. At the same time, the top-level context diagram contains not a single main process, but a set of subsystems connected by data flows. The next level of context diagrams details the context and structure of subsystems.

The hierarchy of context diagrams determines the interaction of the main functional subsystems of the designed IS both among themselves and with external input and output data streams and external objects (sources and receivers of information) with which the IS interacts.

The development of context diagrams solves the problem of strictly defining the functional structure of an IS at the earliest stage of its design, which is especially important for complex multifunctional systems in the development of which different organizations and development teams participate.

After constructing context diagrams, the resulting model should be checked for completeness of initial data about system objects and isolation of objects (absence of information connections with other objects).
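The isolation check mentioned above can be sketched as a simple set operation. This is a minimal illustration that assumes flows are given as (source, target) pairs; the function name `isolated_objects` and the sample objects are hypothetical.

```python
def isolated_objects(objects: set[str], flows: list[tuple[str, str]]) -> set[str]:
    """Return the objects that are neither source nor target of any flow."""
    connected = {end for flow in flows for end in flow}
    return objects - connected


objects = {"Sales", "Warehouse", "Accounting", "Customer"}
flows = [("Customer", "Sales"), ("Sales", "Warehouse")]
isolated = isolated_objects(objects, flows)  # "Accounting" has no connections
```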

Each subsystem present on the context diagrams is detailed using a DFD. Each process on a DFD can, in turn, be detailed using a lower-level DFD or a mini-specification. When detailing, the following rules must be followed: the balancing rule means that, when a subsystem or process is detailed, the detailing diagram may show as external sources/receivers of data only those components (subsystems, processes, external entities, data stores) with which the subsystem or process being detailed has an information connection on the parent diagram; the numbering rule means that, when processes are detailed, their hierarchical numbering must be maintained. For example, the processes detailing process number 12 are given the numbers 12.1, 12.2, 12.3, etc.
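Both rules lend themselves to a mechanical check. The sketch below is one possible reading, not a prescribed algorithm: `balancing_violations` and `numbering_ok` are hypothetical names, components are identified by plain strings, and the uniqueness requirement for child numbers is an added assumption.

```python
def balancing_violations(parent_neighbours: set[str],
                         child_externals: set[str]) -> set[str]:
    """Components shown as external sources/receivers on the detailing diagram
    that have no information connection with the parent process."""
    return child_externals - parent_neighbours


def numbering_ok(parent: str, children: list[str]) -> bool:
    """Children of process N must be numbered N.<positive integer>."""
    prefix = parent + "."
    suffixes = [c[len(prefix):] for c in children if c.startswith(prefix)]
    return (
        len(suffixes) == len(children)                 # every child has the prefix
        and all(s.isdecimal() and int(s) > 0 for s in suffixes)
        and len(set(children)) == len(children)        # assumption: no duplicates
    )
```

For example, if process 12 is connected to "Customer" and "Orders" on the parent diagram, a detailing diagram that also shows "Bank" as an external source would be reported by `balancing_violations`.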

The mini-specification (description of the process logic) should formulate its main functions in such a way that in the future the specialist implementing the project will be able to carry them out or develop an appropriate program.

The mini-specification is the terminal node (leaf) of the DFD hierarchy. The analyst decides to stop detailing a process and use a mini-specification based on the following criteria: the process has a relatively small number of input and output data flows (2-3 flows); the data transformation performed by the process can be described as a sequential algorithm; the process performs a single logical function of converting input information into output information; the process logic can be described by a small mini-specification (no more than 20-30 lines).
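These criteria can be written down as a checklist. The sketch below assumes that all four criteria must hold and that the "2-3 flows" limit applies to inputs and outputs separately; both are interpretations rather than statements from the methodology, and the names `ProcessProfile` and `ready_for_minispec` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProcessProfile:
    input_flows: int
    output_flows: int
    is_sequential: bool    # transformation can be written as a sequential algorithm
    single_function: bool  # performs one logical function
    spec_lines: int        # estimated length of the mini-specification


def ready_for_minispec(p: ProcessProfile,
                       max_flows: int = 3,
                       max_lines: int = 30) -> bool:
    """True if detailing can stop and a mini-specification can be written."""
    return (
        p.input_flows <= max_flows
        and p.output_flows <= max_flows
        and p.is_sequential
        and p.single_function
        and p.spec_lines <= max_lines
    )
```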

When building a DFD hierarchy, you should proceed to detailing the processes only after the content of all flows and data stores has been defined; this content is described using data structures. Data structures are constructed from data elements and can contain alternatives, conditional occurrences, and iterations. A conditional occurrence means that the component may be absent from the structure. An alternative means that the structure may include one of the listed elements. An iteration means that any number of elements within a specified range may occur. For each data element, its type (continuous or discrete data) can be specified. For continuous data, the unit of measurement (kg, cm, etc.), range of values, accuracy of representation, and form of physical coding may be specified. For discrete data, a table of acceptable values can be specified.
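The three constructs can be illustrated with a small validator for flat records. This is a simplified sketch: nesting is not supported, an alternative is read as "exactly one of the listed elements" (an assumption), and the class names and the sample `ORDER` structure are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Element:      # mandatory data element
    name: str


@dataclass
class Conditional:  # conditional occurrence: the element may be absent
    name: str


@dataclass
class Alternative:  # one of the listed elements
    names: list[str]


@dataclass
class Iteration:    # repeated element, count within [low, high]
    name: str
    low: int
    high: int


def check(record: dict, parts: list) -> list[str]:
    """Return a list of problems found in the record (empty if it conforms)."""
    errors = []
    for part in parts:
        if isinstance(part, Element):
            if part.name not in record:
                errors.append(f"missing {part.name}")
        elif isinstance(part, Alternative):
            present = [n for n in part.names if n in record]
            if len(present) != 1:
                errors.append(f"need exactly one of {part.names}")
        elif isinstance(part, Iteration):
            count = len(record.get(part.name, []))
            if not part.low <= count <= part.high:
                errors.append(f"{part.name}: {count} items, allowed {part.low}..{part.high}")
        # Conditional parts need no check: presence and absence are both valid.
    return errors


ORDER = [
    Element("order_number"),
    Conditional("comment"),
    Alternative(["phone", "email"]),
    Iteration("lines", 1, 50),
]
```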

After a complete system model has been built, it must be verified (checked for completeness and consistency). In a complete model, all objects (subsystems, processes, data flows) must be described and detailed. Any objects found not to be detailed should be detailed by returning to the previous development steps. In a consistent model, all data flows and data stores must obey the information retention rule: all data arriving somewhere must be read, and all data read must be written.
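For data stores, the information retention rule can be checked by looking for stores that are only written or only read. The sketch assumes flows are (source, target) pairs and works at the level of whole stores rather than individual data elements; `retention_violations` and the sample names are hypothetical.

```python
def retention_violations(stores: list[str],
                         flows: list[tuple[str, str]]) -> dict[str, str]:
    """Report stores whose data is written but never read, or read but never written."""
    problems = {}
    for store in stores:
        written = any(target == store for _, target in flows)
        read = any(source == store for source, _ in flows)
        if written and not read:
            problems[store] = "written but never read"
        elif read and not written:
            problems[store] = "read but never written"
    return problems


flows = [
    ("Accept order", "Orders"),
    ("Orders", "Issue invoice"),
    ("Accept order", "Archive"),
    ("Prices", "Issue invoice"),
]
report = retention_violations(["Orders", "Archive", "Prices"], flows)
```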

General provisions

DFD is the generally accepted abbreviation of the English term Data Flow Diagrams. It is the name of a graphical structural analysis methodology that describes the data sources and destinations external to the system, the logical functions, the data flows and the data stores that are accessed.

Data flow diagram (DFD) (Fig. 2.1.) is one of the main tools for structural analysis and design of information systems that existed before the widespread use of UML. Despite the shift in emphasis in modern conditions from a structural to an object-oriented approach to the analysis and design of systems, “old” structural notations are still widely and effectively used both in business analysis and in the analysis of information systems.

Fig.2.1. Data flow diagram.

Historically, two notations have been used to describe DFD diagrams - Yourdon and Gane-Sarson, which differ in syntax. The illustration below uses Gane-Sarson notation.

The information system receives data streams from the outside. To designate the elements of the system's operating environment, the concept of an external entity is used. Within the system, there are processes of information transformation that generate new data flows. Data streams can be input to other processes, placed (and retrieved) into data storage, and transmitted to external entities.

The DFD model, like most other structural models, is a hierarchical model. Each process can be decomposed, that is, divided into structural components, the relationships between which in the same notation can be shown in a separate diagram. When the required depth of decomposition is achieved, the lower-level process is accompanied by a mini-specification (text description).

In addition, the DFD notation supports the concept of a subsystem - a structural component of the system being developed.

DFD notation is a convenient tool for producing a context diagram, that is, a diagram showing the automated information system (AIS) under development in communication with its external environment. This is the top-level diagram in the DFD hierarchy. Its purpose is to limit the scope of the system, that is, to determine where the system being developed ends and the environment begins. Other notations often used for context diagrams are the SADT diagram and the use case diagram.

To solve the problem of functional modeling based on structural analysis, two types of models are traditionally used: IDEF0 diagrams and data flow diagrams.

The methodology for developing process diagrams is usually used when conducting surveys of enterprises as part of management consulting projects, as well as in automation projects for large facilities during express surveys (usually to draw up a detailed work plan).

The notation of data flow diagrams allows you to display on the diagram both the steps of a business process and the flow of documents and control (mainly control, since at the top level of description of process areas the transfer of control is important). You can also display automation tools for business process steps on the diagram. Typically used to display the third and lower levels of business process decomposition (the first level is the identified list of business processes, and the second is the functions performed within the business processes).

Data flow diagrams (DFD):

· are the main means of modeling functional requirements for the system being designed;

· are created to model the existing process of information flow;

· are used to describe document flow and information processing;

· are used as an addition to the IDEF0 model for a more visual display of current document flow operations (information exchange);

· support the analysis and determination of the main directions of IS reengineering.

DFD diagrams can complement what is already reflected in the IDEF0 model, since they describe data flows and make it possible to trace how information is exchanged both between business functions within the system and between the system as a whole and the external information environment.

If the modeled system has a software part (which is almost always the case), preference is usually given to DFD for the following reasons.

1. DFD diagrams were created as a tool for designing software systems, while IDEF0 was created as a tool for designing systems in general, so DFDs have a richer set of elements that adequately reflect the specifics of software (for example, data stores are prototypes of files or databases).

2. The presence of mini-specifications of lower-level DFD processes allows us to overcome the logical incompleteness of IDEF0, namely the break of the model at some fairly low level, when its further detailing becomes meaningless, and to build a complete functional specification of the system being developed.

3. Algorithms for automatically converting the DFD hierarchy into structure charts exist and are supported by a number of CASE tools. The structure charts show inter-system and intra-system connections as well as the hierarchy of systems, and together with the mini-specifications they form a complete assignment for the programmer.

Using DFD diagrams, the requirements for the designed IS are divided into functional components (processes) and presented as a network connected by data flows. The main purpose of DFD function decomposition is to demonstrate how each process transforms its inputs into outputs, as well as to reveal the relationships between these processes. Business process diagrams display:

· process functions;

· incoming and outgoing information when describing documents;

· external business processes described in other diagrams;

· break points when the process moves to other pages.

Whereas the IDEF0 methodology considers the system only as a network of interconnected functions, a DFD diagram also considers it as a collection of entities (objects) linked by those functions.

Structural analysis is a systematic, step-by-step approach to analyzing requirements and designing specifications for a system, whether existing or newly created. The Gane-Sarson and Yourdon/DeMarco data flow diagramming methodologies, based on the idea of top-down hierarchical organization, best demonstrate this approach.

The goal of these two methodologies is to transform general, unclear knowledge about system requirements into definitions that are as precise as possible. Both methodologies focus on data flows, and their main purpose is to create graphics-based functional requirements documents. The methodologies are supported by traditional top-down design methods and provide one of the best means of communication between analysts, developers and system users by integrating the following tools:

· Data flow diagrams.

· Data dictionaries, which are catalogs of all data elements present in the DFD, including group and individual data flows, stores and processes, and all their attributes.

· Processing mini-specifications, which describe low-level DFD processes and are the basis for code generation.

Mini-specification.

A mini-specification is a description of the algorithm for the tasks performed by a process; the set of all mini-specifications is a complete specification of the system. A mini-specification contains the number and/or name of the process, lists of its input and output data, and the body (description) of the process, which specifies the algorithm or operation that transforms the input data flows into output ones. Many different methods are known for specifying the body of a process; the corresponding language can range from structured natural language or pseudocode to visual design languages (such as FLOW-forms and Nassi-Shneiderman diagrams) and formal computer languages.
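The parts of a mini-specification named above (number and/or name, inputs, outputs, body) map naturally onto a record. The sketch below is only an illustration: the `MiniSpec` class, its text layout and the sample invoice process are hypothetical and do not follow any particular CASE tool.

```python
from dataclasses import dataclass


@dataclass
class MiniSpec:
    number: str         # hierarchical process number
    name: str           # process name
    inputs: list[str]   # input data flows
    outputs: list[str]  # output data flows
    body: str           # structured natural language or pseudocode

    def render(self) -> str:
        """Format the mini-specification as plain text."""
        return "\n".join([
            f"PROCESS {self.number} {self.name}",
            "INPUT: " + ", ".join(self.inputs),
            "OUTPUT: " + ", ".join(self.outputs),
            "BODY:",
            self.body,
        ])


spec = MiniSpec(
    number="12.3",
    name="Issue invoice",
    inputs=["accepted order", "price list"],
    outputs=["invoice"],
    body="FOR EACH order line: amount = quantity * price\nSUM the amounts into the invoice total",
)
```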

Design specifications are built automatically from DFDs and their mini-specifications. Most often, design specifications are described using the Jackson structure chart technique, which illustrates the hierarchy of modules, the connections between them and some information about their execution (sequence of calls, iteration). There are a number of methods for automatically converting DFDs into structure charts.

The main distinguishing feature of the Gane-Sarson methodology is a data modeling stage that determines the contents of the data stores (databases and files) in the DFD in third normal form. This stage involves constructing a list of the data elements located in each data store; analyzing the relationships between data and constructing a corresponding diagram of connections between data elements; and presenting all the information in the model as related normalized tables. In addition, the methodologies differ in purely syntactic aspects; for example, the graphical symbols representing the components of the DFD are different.

The methods discussed are methods that help you move from a blank sheet of paper or screen to a well-organized model of the system. Both methodologies are based on the simple concept of a top-down, step-by-step breakdown of system functions into subfunctions:

At the first stage, a top-level context diagram is formed that identifies the boundaries of the system and defines the interfaces between the system and the environment.

After interviewing a subject matter expert, a list of external events to which the system must respond is generated. For each of these events, an empty process (bubble) is built on the assumption that its function provides the required response to this event, which in most cases includes the generation of output streams and events (but may also include entering information into the data store for use by other events and processes).

At the next level of detail, similar activities are performed for each of the empty processes.

To enhance functionality, this diagram notation provides specific elements designed to describe information and document flows, such as external entities and data stores.

Basic symbols of DFD diagrams according to these notations:

Fig. 3.1. Basic symbols of DFD diagrams

In addition to the Yourdon/DeMarco and Gane-Sarson notations, other notations (OMT, SSADM, etc.) can be used for the elements of DFD diagrams. They all have almost the same functionality and differ only in details.

Despite the fact that the IDEF0 methodology has become widespread, many analysts consider DFD much more suitable for the design of information systems in general and databases in particular. DFD makes it possible to determine the basic data requirements already at the functional modeling stage (this is facilitated by the division of data flows into material, information and control flows). In addition, integrating DFD models with ER (entity-relationship) models does not cause difficulties. For example, you can define a list of attributes for the data stores, and at the information modeling stage the stores are mapped unambiguously to entities of the entity-relationship model.

In turn, as already noted, IDEF0 is more suitable for solving problems related to management consulting (process reengineering). This is also facilitated by the close connection of IDEF0 with the ABC (Activity Based Costing) method of functional cost analysis, which allows one to determine a scheme for calculating the cost of performing a particular business procedure. However, there are a number of CASE systems that offer the IDEF0 methodology at the stage of functional examination of the subject area. In such systems, simply a list of all IDEF0 model objects (inputs, outputs, mechanisms, control) is passed to the next stage, which are then considered for inclusion in the information model.

2.2.4.3. DFD Notation Terminology.

DFD BLOCKS – a graphical representation of an operation (process, function, activity) that processes or converts information (data). A DFD block that displays a function has the same meaning as IDEF0 and IDEF3 blocks: it converts inputs into outputs. DFD blocks also have inputs and outputs but, unlike IDEF0 blocks, do not support controls and mechanisms.

The purpose of a function is to create output flows from input flows in accordance with the action specified by the process name. The function name must therefore contain a verb in the infinitive followed by an object. Functions are usually named after the system, for example “CAD Development”. It is recommended to use verbs that reflect dynamic relationships, for example: “calculate”, “get”, “order”, “mill”, “turn”, “enable”, “model”, etc. If the author uses verbs such as “process”, “upgrade”, or “edit”, he probably does not yet understand the function of the process deeply enough and needs to analyze it further.
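The naming advice can be turned into a simple lint over process names. This is a rough sketch: it only looks at the first word of each name, and the set `VAGUE_VERBS` contains just the three verbs named in the text; the function name is hypothetical.

```python
VAGUE_VERBS = {"process", "upgrade", "edit"}


def vague_process_names(names: list[str]) -> list[str]:
    """Return the names whose leading verb suggests the function needs more analysis."""
    flagged = []
    for name in names:
        words = name.strip().lower().split()
        if words and words[0] in VAGUE_VERBS:
            flagged.append(name)
    return flagged


flagged = vague_process_names(["Calculate cost", "Process order", "Order materials"])
```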

According to Gane-Sarson notation, a DFD block is depicted as a rectangle with rounded corners. Each block must have a unique number for reference within the diagram. Each block number may include a prefix, a parent block number (A), and an object number, which is the unique number of the block in the diagram. For example, the function may be numbered A.12.4.

In order to avoid intersections of data flow lines, the same element can be displayed several times on the same diagram; in such a case, two or more rectangles representing the same element can be identified by a line crossing the lower right corner.

DATA FLOW is a mechanism used to model the transfer of information between participants in the information exchange process (functions, data stores, external references). In Gane-Sarson notation, a flow of data (documents, objects, employees, departments or other participants in information processing) is depicted by an arrow between two DFD diagram objects, preferably horizontal and/or vertical, with the direction of the arrow indicating the direction of flow. Every arrow must have a source and a target. Unlike IDEF0 diagram arrows (ICOM), DFD arrows can enter or exit any side of the block.

Arrows describe how objects (including data) move from one part of the system to another. Since in DFD each side of a block does not have a clear purpose, unlike the blocks in an IDEF0 diagram, arrows can come in and out of any face. In DFD diagrams, to describe command-response type dialogs between operations, bidirectional arrows are used between a function and an external entity and/or between external entities. Arrows can merge and branch, which makes it possible to describe the decomposition of arrows. Each new segment of a merging or branching arrow can have its own name.

Sometimes information can move in one direction, be processed, and come back. This situation can be modeled either by two different flows or by one bidirectional flow. You can reference a data flow by identifying the processes, entities, or data stores that the flow connects.

Each stream should have a name along or above the arrow, chosen to best convey the meaning of the stream's contents to users viewing the data flow diagram. When sketching a data flow diagram, you can omit the names if they are obvious to the user, but the author of the diagram should always provide a description of the flow.

DATA FLOW DIAGRAM (DFD diagram) (Fig. 4.1.) – a diagram used for the graphical representation (flowchart) of the movement and processing of information in an organization or in any process. Diagrams of this type are typically used to analyze the organization of information flows and to develop information systems. DFD diagrams are a key part of the requirements specification document: they are graphical hierarchical specifications that describe the system in terms of data flows. Each process node in a DFD can be expanded into a lower-level diagram, which makes it possible to abstract from details at any level.

Fig.4.1. Example of a DFD data flow diagram.

This type of diagram is usually abbreviated as DFD. A DFD may include four graphical symbols representing data flows, processes that transform input data flows into output flows, external data sources and destinations, and the files and databases that the processes require for their operation.

DFD diagrams model the functions that a system must perform, but communicate almost nothing about the relationships between data or the behavior of the system over time - for these purposes entity-relationship diagrams and state transition diagrams are used, respectively.

DATA STORE (Fig. 4.2.) – a graphical representation of data flows imported into or exported from the corresponding databases. Typically these are tables for storing documents. Unlike arrows, which describe objects in motion, data stores depict objects at rest. Data stores are a kind of prototype of an organization's information system database. Data stores are included in the system model if there are stages in the process cycle at which data appears that needs to be kept in memory. To show data being saved, the data flow arrow is directed into the data store; to show data being imported, it is directed out of the store.

Fig.4.2. Data store.

Data stores are intended to represent certain abstract devices for storing information that can be placed or retrieved there at any time, regardless of their specific physical implementation. Data stores are used:

in material systems - where objects are waiting to be processed, for example in a queue;

in information processing systems to model mechanisms for storing data for further operations.

In Gane-Sarson notation, a data store is denoted by two horizontal lines closed at one end. Each data store should be identified for reference by the letter D and an arbitrary number in the square on the left side, for example D5. The name should be chosen to be as informative as possible for the user.

A model can have multiple occurrences of a data store, each of which can have the same name and reference number. To avoid complicating the data flow diagram with intersecting lines, duplicate data stores can be drawn with additional vertical lines on the left side of the square.

EXTERNAL REFERENCE (external link, external entity) (Fig. 4.3.) – a data flow diagram object that is a source or receiver of information outside the model. External references/entities represent inputs and/or outputs, i.e. they provide an interface with external objects located outside the modeled system. The external references of a system are usually logical classes of objects or people that act as sources or receivers of messages, for example customers, designers, technologists, production services, storekeepers, etc. They can also be specific sources, such as accounting, an information retrieval system, a regulatory control service, or a warehouse. If the system in question receives data from another system or transmits data to another system, that other system is an external entity. Without an “external entity” object, it is sometimes difficult for an analyst to determine where the company received particular documents from, or which other documents come from an external entity such as “client”.

Fig.4.3. External entity.

In Gane-Sarson notation, the external reference icon is a rectangle whose top and left sides are shaded (drawn with double thickness) to distinguish it from the other icons on the diagram; it is usually placed at the edges of the diagram. An external reference can be identified by the letter E in the upper left corner and a unique number, for example E5. In addition, the external reference has a name.
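The reference identifiers described for data stores (D plus a number, such as D5) and external references (E plus a number, such as E5) are easy to recognize programmatically. The sketch below assumes exactly that format, with no separators or leading text; the function name `classify_reference` is hypothetical.

```python
import re

# One capital letter (D or E) followed by a decimal number, e.g. "D5" or "E12".
REFERENCE_ID = re.compile(r"(?P<kind>[DE])(?P<number>[0-9]+)")


def classify_reference(ref: str):
    """Return (kind, number) for a valid identifier, or None otherwise."""
    match = REFERENCE_ID.fullmatch(ref)
    if match is None:
        return None
    kind = "data store" if match["kind"] == "D" else "external reference"
    return kind, int(match["number"])
```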

When considering a system as an external function, it is often indicated that it is outside the boundaries of the system being modeled. After the analysis, some external links can be moved inside the data flow diagram of the system under consideration or, conversely, some part of the system’s functions can be taken out and considered as an external link.

When interpreting a DFD diagram, the following rules are used:

· functions transform incoming data streams into output ones;

· data stores do not change data flows, but serve only to store incoming objects;

· data stream transformations in external links are ignored.

In addition, the associated data elements are defined for each information flow and store. Each data element is given a name and may also have a data type and format. This information is the starting point for the next design stage, building an “entity-relationship” model. As a rule, data stores are converted into entities; the designer only has to decide what to do with the data elements that are not associated with any store.
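The transition to an entity-relationship model can be sketched as follows: each store becomes a candidate entity with its data elements as attributes, and the elements that occur only in flows are listed for the designer to resolve. The function name `stores_to_entities` and the sample data are hypothetical, and real ER design involves considerably more than this.

```python
def stores_to_entities(store_elements: dict[str, set[str]],
                       flow_elements: dict[str, set[str]]):
    """Return (entities, unassigned): candidate entities with sorted attributes,
    and data elements that appear in flows but in no store."""
    entities = {store: sorted(elems) for store, elems in store_elements.items()}
    stored = set().union(*store_elements.values())
    flowing = set().union(*flow_elements.values())
    return entities, sorted(flowing - stored)


entities, unassigned = stores_to_entities(
    {"Orders": {"order_number", "customer_id"}},
    {"invoice": {"order_number", "amount"}},
)
```

Here "amount" occurs only in the "invoice" flow, so it ends up in `unassigned` and the designer has to decide where it belongs.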

Representing flows as arrows together with data stores and external entities makes DFD models more similar to the physical characteristics of the system - the movement of objects, the storage of objects, the delivery and distribution of objects.

Constructing diagrams.

DFD diagrams can be constructed using traditional structural analysis, similar to how IDEF0 diagrams are constructed:

· a physical model is built that reflects the current state of affairs;

· the resulting model is converted into a logical model that reflects the requirements for the existing system;

· a model is built that reflects the requirements for the future system;

· a physical model is built, on the basis of which a new system should be built.

An alternative approach, used in software design, is event partitioning, in which the system model is built from several DFD diagrams:

· a logical model is built as a set of processes and documentation of what these processes should do;

· using the environment model, the system is described as an object interacting with events from external entities. An environment model typically contains a description of the system's purpose, one context diagram, and a list of events. The context diagram contains one block depicting the system as a whole, the external entities with which the system interacts, links and some arrows imported from IDEF0 and DFD diagrams. Including external references in a context diagram does not replace the methodology's requirement to clearly define the purpose, scope, and common point of view of the system being modeled;

· a behavior model shows how the system processes events. This model consists of a single diagram in which each block represents the response to one event from the environment model; stores can be added to model the data that must be remembered between events. Flows are added to connect the blocks with other elements, and the diagram is checked against the environment model.

The resulting diagrams can be transformed to provide a more visual representation of the system; in particular, functions can be decomposed.

An example of DFD diagrams using the Gane-Sarson notation for an enterprise that builds its activities on the “make-to-order” principle is shown in Figure 5.1.

Based on received orders, a product release plan for a certain period is formed. In accordance with this plan, the need for components and materials is determined, as well as the loading schedule for production equipment. After manufacturing the products and making payments, the finished products are sent to the customer.

Orders are subject to incoming inspection and sorting. If an order does not match the product range or is filled in incorrectly, it is canceled and the customer is notified accordingly. If the order is not canceled, it is determined whether the corresponding product is in stock. If it is, an invoice for payment is issued and presented to the customer; upon receipt of payment, the goods are sent to the customer. If the order is not covered by warehouse stock, a request for the product is sent to the manufacturer. After the required goods arrive at the company's warehouse, the order is covered by stock and follows the route described above again.

Fig. 5.1. Example of a DFD diagram in the Gane-Sarson notation for an enterprise

This diagram represents the topmost level of the functional model. Naturally, this is a very rough description of the subject area. The model is refined by detailing the necessary functions on a next-level DFD diagram. For example, the function “Identification of needs and provision of materials” can be broken down into the subfunctions “Identification of needs”, “Search for suppliers”, “Conclusion and analysis of supply contracts”, “Payment control”, and “Supply control”, connected by their own data flows, which will be presented in a separate diagram. The model should be refined until it contains all the information necessary to build an information system.

The advantages of the DFD technique include:

· the ability to uniquely identify external entities by analyzing information flows inside and outside the system;

· the ability to design from top to bottom, which facilitates the construction of an “as it should be” model;

· the presence of specifications of lower-level processes, which allows you to overcome the logical incompleteness of the functional model and build a complete functional specification of the system being developed.

The disadvantages of the model include:

· the need to introduce control processes artificially, since from the point of view of DFD control actions (flows) and control processes are no different from ordinary ones;

· absence of the concept of time, i.e. lack of analysis of time intervals when converting data (all time restrictions must be entered in the process specifications).

SADT technology

Introduction

SADT (Structured Analysis and Design Technique) is one of the best-known methodologies for the analysis and design of systems, introduced in 1973 by Ross. SADT has been used successfully in military, industrial and commercial organizations to solve a wide range of problems, such as telephone network software, system support and diagnostics, long-range and strategic planning, computer-aided manufacturing and design, computer system configuration, personnel training, embedded software for defense systems, financial and logistics management, etc. The methodology is widely supported by the US Department of Defense, which initiated the development of the IDEF0 standard as a subset of SADT. This, along with growing automated support, has made it more accessible and easier to use.

From a SADT perspective, a model can be based either on the functions of the system or on its objects (plans, data, equipment, information, etc.). The corresponding models are usually called activity models and data models. The activity model represents, with the required degree of detail, a system of activities, which in turn reflect their relationships through the objects of the system. Data models are dual to activity models and represent a detailed description of system objects related by system activities. The complete SADT methodology is to build both types of models to more accurately describe a complex system. However, at present only activity models have found widespread use, and this section is devoted to their consideration.

SADT diagrams

The main working element in modeling is the diagram. The SADT model aggregates and organizes diagrams into hierarchical tree structures: the higher the level of a diagram, the less detailed it is. A diagram includes blocks that depict the activities of the modeled system and arcs that link the blocks together and depict the interactions and relationships between them. SADT requires a diagram to have 3-6 blocks: within these limits, diagrams and models are easy to read, understand, and use. Instead of one cumbersome model, several small interconnected models are used, whose meanings complement each other and make the structure of a complex object clear. However, such a strict limit on the number of blocks per diagram restricts the use of SADT in a number of subject areas. For example, banking structures have 15-20 equal activities that would best be shown on one diagram. Artificially spreading them across different levels of the SADT model clearly does not make it easier to understand.

Block structure

The blocks in the diagrams are depicted as rectangles and are accompanied by natural-language text describing the activities. Unlike in other structural analysis methods, in SADT each side of a block has a specific purpose: the left side is intended for Inputs, the top for Controls, the right for Outputs, and the bottom for Executors. This convention reflects certain principles of activity: Inputs are converted into Outputs, Controls limit or prescribe the conditions of execution, and Executors describe how the transformations are carried out.

Arcs in SADT represent sets of objects and are labeled with natural-language text. Objects can relate to activities in four possible ways: Input, Output, Control, Executor. Each of these relationships is represented by an arc attached to a specific side of the block, so the sides of the block sort the objects represented by the arcs purely graphically. Input arcs represent objects used and transformed by activities. Control arcs typically depict information that governs the actions of activities. Output arcs represent the objects into which the inputs are converted. Executor arcs reflect (at least in part) how activities are implemented.
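
The side convention described above can be captured in a small data structure. The following is only an illustrative sketch: the `Block` class, the `SIDE_OF_ROLE` table, and the example arc labels are assumptions made for this example and are not part of SADT itself.

```python
from dataclasses import dataclass, field

# Side of the block rectangle reserved for each kind of arc.
SIDE_OF_ROLE = {
    "input": "left",
    "control": "top",
    "output": "right",
    "executor": "bottom",
}


@dataclass
class Block:
    number: int
    name: str
    # role -> list of arc labels, e.g. {"input": ["customer order"]}
    arcs: dict = field(default_factory=dict)

    def arcs_by_side(self):
        """Group the arc labels by the side of the block they attach to."""
        sides = {side: [] for side in SIDE_OF_ROLE.values()}
        for role, labels in self.arcs.items():
            if role not in SIDE_OF_ROLE:
                raise ValueError(f"unknown arc role: {role}")
            sides[SIDE_OF_ROLE[role]].extend(labels)
        return sides


# Hypothetical block used only for illustration.
block = Block(1, "Process order", {
    "input": ["customer order"],
    "control": ["price list"],
    "output": ["invoice"],
    "executor": ["sales manager"],
})
layout = block.arcs_by_side()
```

Here `layout` maps each side of the rectangle to the labels of the arcs drawn on that side, which is exactly the graphical sorting the text describes.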

The blocks on the diagram are placed in a “stepped” pattern in accordance with their dominance, which is understood as the influence exerted by one block on the others. In addition, blocks should be numbered, for example, according to their dominance. Block numbers serve as unique identifiers for activities and automatically organize these activities into a model hierarchy.

The mutual influence of blocks can be expressed either in sending the Output to another activity for further conversion, or in the generation of control information that prescribes what exactly the other activity should do. Thus, SADT diagrams are prescriptive diagrams, describing both the transformations between Input and Output and the prescriptive rules for these transformations.

Relationships

SADT requires only five types of relationships between blocks: Control, Input, Control Feedback, Input Feedback, and Output-Executor. Control and Input relationships are the simplest because they reflect intuitively obvious direct influences. A Control relationship occurs when the Output of one block directly affects a less dominant block. An Input relationship occurs when the Output of one block becomes the Input of a less dominant block. Feedbacks are more complex because they reflect iteration or recursion: Outputs from one activity affect the future execution of other activities, which subsequently affects the original activity. Control Feedback occurs when the Output of a block influences a block with greater dominance, and Input Feedback occurs when the Output of one block becomes the Input of another block with greater dominance. The Output-Executor relationship is rare and is of particular interest. It reflects a situation in which the Output of one activity becomes a means of achieving the goal of another activity.
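
The five relationship types can be told apart from two facts: which side of the receiving block the arc enters, and whether the receiving block is more or less dominant than the sender. The sketch below shows one way to express this. It assumes that blocks are numbered by dominance, with a smaller number meaning greater dominance; the `Arc` class and the `classify` function are hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    source: int  # number of the block whose Output this arc carries
    target: int  # number of the block that receives it
    side: str    # side role at the target: "input", "control" or "executor"


def classify(arc):
    """Name the SADT relationship that an arc between two blocks expresses.

    Assumption: a smaller block number means greater dominance.
    """
    if arc.source == arc.target:
        raise ValueError("an arc must connect two different blocks")
    if arc.side == "executor":
        return "Output-Executor"
    to_less_dominant = arc.source < arc.target
    if arc.side == "control":
        return "Control" if to_less_dominant else "Control Feedback"
    if arc.side == "input":
        return "Input" if to_less_dominant else "Input Feedback"
    raise ValueError(f"unknown side role: {arc.side}")


kinds = [
    classify(Arc(1, 2, "control")),
    classify(Arc(1, 2, "input")),
    classify(Arc(3, 1, "control")),
    classify(Arc(3, 2, "input")),
    classify(Arc(2, 3, "executor")),
]
```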

IT standards

In the comments to one of my previous articles on IDEF0, one of the users asked me to say more about what DFD is. The concept is somewhat confusing, and many of my clients also ask questions about data flows and diagramming standards. That’s why I decided to dedicate this article to DFD.

DFD is the generally accepted abbreviation of data flow diagrams. This is the name of a graphical structural analysis methodology that describes the data sources and destinations external to the system, the logical functions, the data flows, and the data stores that are accessed. The data flow diagram (DFD) is one of the main tools for structural analysis and design of information systems that existed before the widespread use of UML. (Wikipedia)

In my opinion, the definition from the Russian-language Wikipedia is somewhat overloaded with information and, as a result, unnecessarily difficult to understand. Also, I personally believe that DFD and UML are different tools, and therefore it is incorrect to say that DFD is simply a predecessor to UML.

For myself, I came up with the following formulation:

DFD is a notation designed to model information systems from the point of view of data storage, processing and transmission.

Why do we need DFD notation?

Historically, the syntax of this notation has been used in two versions - Yourdon and Gane-Sarson. The differences between them are in the table below:

I myself use only one of the options, the Gane and Sarson one. But when I was researching the material before writing this article, I saw this comparison table. I believe it matters not so much for choosing a syntax option (that depends more on the software you use to create the diagrams and on your personal preferences) as for clearly illustrating that DFD does not have a rigid syntax, unlike, for example, BPMN. There are different options you can use here; the main thing is that they are clear to you and your clients. DFD notation is a convenient tool for creating ad-hoc diagrams that can be drawn quickly and with maximum freedom.

This type of notation is used when a description of the system as a data warehouse is required. That is, the notation should clearly answer the questions:

What does an information system consist of?
What does it take to process information?

The DFD notation itself consists of the following elements:

Process, i.e. a function or sequence of actions that must be performed for data to be processed. This could be creating an order, registering a client, etc. It is customary to use verbs in process names, i.e. “Create a customer” (not “customer creation”) or “Process an order” (not “order posting”). There is no strict system of requirements as there is, for example, in IDEF0 or BPMN, where the notations have a strictly defined syntax because they can be executable. Still, certain rules should be followed so as not to cause confusion when other people read the DFD.
External Entities. These are any objects that are not included in the system itself, but are a source of information for it or recipients of any information from the system after data processing. This could be a person, an external system, any storage media or data storage.
Data store. Internal data storage for processes in the system. The received data before processing and the result after processing, as well as intermediate values, must be stored somewhere. These are databases, tables or any other option for organizing and storing data. Customer data, customer requests, invoices and any other data that entered the system or are the result of processing processes will be stored here.
Data flow. Displayed in the notation as arrows that show what information enters and what information leaves a particular block on the diagram.

DFD notation can describe any action, including the process of selling or shipping goods, working with customer requests, or purchasing materials, from the point of view of describing the system. This notation helps to understand what the system should consist of and what is needed to automate a business process. But a DFD is not a description of the business process itself. For example, it has no such important parameter as time. This notation also does not provide conditions or “forks”. In DFD, we look at where the data comes from, what data is needed, how it is processed, and where the results should be sent. That is, this notation describes not so much the process itself as the movement of data flows. To work with processes, I recommend using BPMN or IDEF3 (I'll talk about that another time).

How to Create DFD Notations

Let's take a look at sales automation notation as an example. Let's say we have a client who makes an application through the website or by phone. There is a manager who registers this application. Thus, data appears in the system - the client and his order. The warehouse employee must see this and ship the goods with all the necessary documents and hand over the documents to the client.

The sequence looks like this:

The client provides his data and application.
The manager checks and enters the received data into the system.
A warehouse worker generates documents, for example, an invoice, and ships the goods.
The client receives the goods and a package of documents for it.

We need to see this sequence of actions from the point of view of storing data and working with it in the IT system.

From a DFD perspective we have:

The buyer is an external entity that is the source of data and the recipient of the result.
Order processing process (confirmation and posting of data in the system by the manager).
Collection of the order at the warehouse (after receiving the application).
Registration of shipment (creation of necessary documents).

What rules do you need to know to create a DFD diagram:

Every process must have at least one input and one output. The meaning of the processes here is to process data, and therefore the process must receive data (incoming arrow) and give it somewhere after processing (outgoing arrow);
The data processing process must have an external incoming arrow (data from an external entity). In order for any such process to start working, it is not enough to use data from the storage; new information must arrive for subsequent processing;
Arrows cannot directly connect data stores; all connections go through processes. There is no point in simply moving data from one place to another, and this is how the direct connection of two storages is read with an arrow. The data is received in order for some actions to be carried out, in our example, the sales process is carried out. And this is only possible through processing (process);
All processes must be associated either with other processes or with other data stores. Processes do not exist on their own, and therefore the result must be transmitted somewhere;
Decomposition. DFD diagrams make it possible to create large processes and decompose them into subprocesses with a detailed description of actions. For example, we can create a process of “creating an application”, which can then be decomposed into a sequence of actions: receiving the application, then separately checking and obtaining customer data; if a product in an online store is sold to order, creating the application will also require obtaining data from the supplier about the availability of the required items, and so on. The top-level diagram will then contain the “application processing” block, and its decomposition will give a diagram with the detailed sequence of actions at this stage. At no stage will there be conditions or branches, only a process and its decomposition, up to 3-4 levels deep.
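
Some of these rules are mechanical enough to check automatically. The sketch below models the sales example as a set of named elements plus a list of flows and checks two of the rules: every process has at least one incoming and one outgoing arrow, and no arrow connects two data stores directly. The `Diagram` class, the `check_rules` function, and the store and flow names are assumptions made for this illustration; the remaining rules are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Diagram:
    externals: set = field(default_factory=set)
    processes: set = field(default_factory=set)
    stores: set = field(default_factory=set)
    # Each flow is a (source, target, label) tuple.
    flows: list = field(default_factory=list)


def check_rules(diagram):
    """Return a list of human-readable rule violations (empty if none)."""
    problems = []
    for process in sorted(diagram.processes):
        if not any(target == process for _, target, _ in diagram.flows):
            problems.append(f"process '{process}' has no incoming flow")
        if not any(source == process for source, _, _ in diagram.flows):
            problems.append(f"process '{process}' has no outgoing flow")
    for source, target, label in diagram.flows:
        if source in diagram.stores and target in diagram.stores:
            problems.append(f"flow '{label}' connects two data stores directly")
    return problems


# The sales example; the "Orders" store and the flow labels are hypothetical.
sales = Diagram(
    externals={"Buyer"},
    processes={"Process order", "Collect order", "Register shipment"},
    stores={"Orders"},
    flows=[
        ("Buyer", "Process order", "application"),
        ("Process order", "Orders", "confirmed order"),
        ("Orders", "Collect order", "order to collect"),
        ("Collect order", "Register shipment", "collected order"),
        ("Register shipment", "Buyer", "goods and documents"),
    ],
)

# A deliberately broken diagram to show what the checker reports.
broken = Diagram(
    externals={"Buyer"},
    processes={"Process order"},
    stores={"Orders", "Archive"},
    flows=[
        ("Buyer", "Process order", "application"),
        ("Orders", "Archive", "old orders"),
    ],
)

sales_problems = check_rules(sales)
broken_problems = check_rules(broken)
```

For the `sales` diagram the list of problems is empty; for `broken` it reports the process without an outgoing flow and the store-to-store arrow.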

What the diagram will look like (without decomposition, top level):

And the decomposition of the main element of our diagram:

Where are DFD notations used?

DFD diagrams are actively used in software development. In this case:

data warehouses are spreadsheets and databases,
external entities – clients or other databases, including those from other programs (integration and data exchange),
processes are the functions and modules performed in the system.

DFD notations are also convenient for analysis when the system is considered from the point of view of document flow. You can clearly see where the data is stored, how documentation is exchanged, where errors were made in organizing the business processes, etc. But here DFD diagrams must be used with special caution: this is not a description of a business process as such, but rather a diagram of how data moves while business processes are carried out. As an auxiliary option, including for visually demonstrating existing problems and ways of optimizing work to the client, this type of notation is quite suitable.

For example, to identify document flow problems, duplication of documents, or, conversely, missing documentation or electronic data in the system, it is very convenient to create a separate description of the business process, and then a DFD notation for it. Or vice versa, a DFD notation is first created to understand the basics of business and the features of document flow implementation. It helps to identify, for example, the absence of important documents in the automation system that are actually created (on paper), but are not displayed in the system in any way. And then an optimized business process is built, taking into account the identified nuances of document flow.

DFD notations made easy!

I believe that DFD notation is really much simpler than it seems at first glance. The main thing is to clearly understand the limitations of constructing this type of diagram (lack of conditions, time, etc.) and apply them where exactly this approach will be more convenient. Perhaps you will find your own uses for DFD that I did not describe above. My list contains only those options that I use in practice.

What’s especially convenient about DFD notations is that you don’t have to adhere to strict rules and syntax, as, for example, in BPMN. These notations will not be executable; they are needed to understand the features of document flow, structure and subsequent work with data. Therefore, if your diagram is clear to both you and the customer, some deviations from DFD standards are quite acceptable.

In principle, you can draw DFD diagrams wherever and however you prefer. But if you want to work with decomposition, build a system at different levels of detail, then you will have to forget the “drawing tools” (Visio, Paint and the like). You will need specialized modeling programs.

Personally, I use ERwin and recommend it to everyone. One of the reasons for my choice is the features of decomposition. In ERwin, as in some other similar systems, it is possible to decompose DFD processes in the IDEF3 format, i.e. the main diagram will be in DFD format, and at the most general level you will see the main data flows and the “nodes” of their processing. And with decomposition, you can use a process approach, which can also be very convenient for developing large systems or working with different business departments.

Questions and answers

What is the difference between DFD and UML?

There is a notation language called UML, which also positions itself as a data-driven notation. But UML is already close to a programming language: it has a stricter syntax and requirements, but also many more possibilities for describing various functions. DFD is a notation that is used more freely and is better suited to planning, studying possible solutions, discussing them with the customer, etc.

If you are a developer and know UML, it is possible that even some preliminary solutions will be more convenient for you to create in this notation. And for a business consultant, DFD will always be more convenient as a tool, since a business consultant does not need a detailed description of functions from an automation point of view; this is the task of technical specialists. But DFD saves a lot of time and effort.

However, DFD should not be considered as a simplified version of UML. Despite the similarity in approach, these are different tools intended for different purposes.

How many elements can be used in a DFD?

Unlike systems with rigid syntax and regulations, DFD places no limit on the number of elements that can appear on one diagram. For comparison: in IDEF0 the number of such elements is limited; beyond that limit, only detailing (decomposition) or other notations are available.
On the one hand, this is a big plus, since the absence of restrictions gives maximum freedom and comfort when drawing up notation. On the other hand, it is not recommended to abuse this freedom. Remember, the more elements you have in a diagram, the more difficult it is to read.

Can DFD notations be used to work with clients?

In principle, no one can prohibit doing this. Moreover, in limited quantities, as an illustration to some of your explanations, such notations are perfect when discussing the features of the project with the client. But still, clients usually have little understanding of automation issues, data storage structure, processing capabilities, etc. This is all within the competence of the developers. And DFD notations are built taking into account the specifics of working with data, so I still recommend using them mainly when discussing a project with specialists, when creating a technical description and assignment for developers, to increase the developers’ understanding of the essence and features of the project. Even explaining the features of DFD notations to an unprepared customer can be difficult.

Data Flow Diagrams show how each process transforms its inputs into outputs and reveal the relationships between these processes. DFD diagrams are successfully used as an addition to the IDEF0 model to describe document flow and information processing. Like IDEF0, DFD represents the system being modeled as a network of related activities. The main components of DFD (as mentioned above) are processes or activities, external entities, data flows, and data stores.

Fig. 8.8.

BPwin uses the Gane-Sarson notation to create data flow diagrams.

In order to supplement the IDEF0 model with a DFD diagram, you need to “click” on the DFD radio button during the decomposition process in the Activity Box Count dialog. New buttons appear in the tool palette on the new DFD diagram:

(External Reference) - add an external reference to the diagram;
(Data store) - add a data store to the diagram;
Diagram Dictionary Editor - link to another page. Unlike IDEF0, this tool allows you to direct the arrow to any diagram (not just the top level).

Unlike IDEF0 arrows, which represent rigid relationships, DFD arrows show how objects (including data) move from one work to another. This representation of flows, together with data stores and external entities, makes DFD models closer to the physical characteristics of the system: movement of objects, storage of objects, delivery and distribution of objects (Fig. 8.9).

Unlike IDEF0, which views a system as interrelated activities, DFD views a system as a collection of objects. A context diagram often includes works and external references. Works are usually named after the system, e.g. "Information Processing System". Including external references in a context diagram does not remove the methodology's requirement to clearly define the purpose, scope and single point of view of the modeled system.

Fig. 8.9.

In DFD, works (processes) are system functions that transform inputs into outputs. Although works are depicted as rectangles with rounded corners, their meaning is the same as that of IDEF0 and IDEF3 works. Like IDEF3 processes, they have inputs and outputs but, unlike IDEF0 works, do not support controls and mechanisms (Fig. 8.9, blocks “Checking and entering customers” and “Entering orders”).

External entities depict inputs to and/or outputs from the system. They are drawn as a rectangle with a shadow and are usually placed along the edges of the diagram (Fig. 8.9, block “Customer calls”). A single external entity can be used multiple times on one or more diagrams. This technique is usually used to avoid drawing arrows that are too long and confusing.

Flows are depicted by arrows and describe the movement of objects from one part of the system to another. Because in DFD the sides of a work have no fixed purpose, unlike in IDEF0, arrows can enter and leave any side of the work rectangle. DFD also uses bidirectional arrows to describe command-response dialogs between works, between a work and an external entity, and between external entities (Fig. 8.9).

Unlike arrows that describe objects in motion,

Data Flow Diagrams (DFD) represent a hierarchy of functional processes connected by data flows. The purpose of such a representation is to demonstrate how each process transforms its inputs into outputs, as well as to reveal the relationships between these processes.

Two different notations are traditionally used to construct DFDs, corresponding to the Yourdon-DeMarco and Gane-Sarson methods. These notations differ slightly in the graphic representation of symbols (the examples below use the Gane-Sarson notation).

In accordance with this method, the system model is defined as a hierarchy of data flow diagrams that describe the asynchronous process of transforming information from its input into the system to its delivery to the consumer. Sources of information (external entities) generate information flows (data flows) that transfer information to subsystems or processes. Those, in turn, transform information and generate new flows that transfer information to other processes or subsystems, data storage devices or external entities - information consumers.

Diagrams at the top levels of the hierarchy (context diagrams) define the main processes or subsystems with external inputs and outputs. They are detailed using lower level diagrams. This decomposition continues, creating a multi-level hierarchy of diagrams, until a decomposition level is reached at which it makes no sense to detail the processes further.

Composition of Data Flow Diagrams

The main components of data flow diagrams are: external entities; systems and subsystems; processes; data storage devices; data streams.

External Entity represents a material object or individual that is the source or receiver of information, for example, customers, personnel, suppliers, clients, warehouse. Defining an object or system as an external entity indicates that it is outside the boundaries of the system being analyzed. During the analysis process, some external entities can be transferred inside the diagram of the analyzed system, if necessary, or, conversely, some processes can be moved outside the diagram and presented as an external entity.

The external entity is indicated by a square (Fig. 1), located above the diagram and casting a shadow on it so that this symbol can be distinguished from other designations.

Figure 1. Graphical representation of an external entity

When building a model of a complex system, it can be presented in the most general form on the so-called context diagram in the form of one system as a single whole, or it can be decomposed into a number of subsystems. A subsystem (or system) is depicted on a context diagram as it is shown in Fig. 2.

Figure 2. Subsystem for working with individuals (GNI - State Tax Inspectorate)

The subsystem number serves to identify it. The name field contains the name of the subsystem in the form of a sentence with a subject and the appropriate modifiers and objects.

Process represents the transformation of input data streams into output ones in accordance with a certain algorithm. Physically, the process can be implemented in various ways: it can be a division of the organization (department) that processes input documents and issues reports, a program, a hardware-implemented logical device, etc. The process in a data flow diagram is depicted as shown in Fig. 3.

Figure 3. Graphical representation of the process

The process number serves to identify it. The name field contains the name of the process in the form of a sentence with an active, unambiguous verb in the infinitive (compute, calculate, check, determine, create, receive), followed by nouns in the accusative case, for example: “Enter information about taxpayers”, “Issue information about current expenses”, “Check the receipt of money”. The physical implementation field indicates which organizational unit, program, or hardware device executes the process.

A data store is an abstract device for storing information, which can be placed in the store at any time and retrieved later; the methods of storing and retrieving can be arbitrary. A data store can be implemented physically as a microfiche, a box in a file cabinet, a table in RAM, a file on magnetic media, etc. The data store in a data flow diagram is depicted as shown in Fig. 4.

Figure 4. Graphical representation of the data storage device

The data store is identified by the letter "D" and an arbitrary number. The store name is chosen to be as informative as possible for the designer. In general, a data store is a prototype of a future database, and the description of the data stored in it must correspond to the data model.

A data flow defines the information transmitted through some connection from a source to a receiver. The actual data flow can be information transmitted over a cable between two devices, letters sent by mail, magnetic tapes or floppy disks transferred from one computer to another, etc.

The flow of data in the diagram is represented by a line ending with an arrow that shows the direction of the flow (Fig. 5).

Each data stream has a name that reflects its content.

Figure 5. Data flow

Building a hierarchy of data flow diagrams

The main goal of constructing a DFD hierarchy is to make the system description clear and understandable at each level of detail, and to break it down into parts with precisely defined relationships between them. To achieve this, it is advisable to use the following recommendations:

Place from 3 to 6-7 processes on each diagram (similar to SADT). The upper limit corresponds to the human ability to simultaneously perceive and understand the structure of a complex system with many internal connections, the lower limit is chosen for reasons of common sense: there is no need to detail the process with a diagram containing only one or two processes.

Do not clutter the diagrams with details that are unimportant at this level.

Decompose data flows in parallel with the decomposition of processes. These two tasks should be done simultaneously, not one after the other.

Choose clear, descriptive names for processes and flows, and try not to use abbreviations.
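
The first recommendation is easy to check mechanically once the hierarchy is written down. The sketch below assumes the hierarchy is stored as a mapping from a diagram name to the list of processes drawn on it; the function name, the diagram names, and the process names are hypothetical. The context diagram, which by design holds a single main process, is assumed to be left out of the mapping by the caller.

```python
def diagrams_outside_range(diagrams, low=3, high=7):
    """Return {diagram name: process count} for diagrams that have
    fewer than `low` or more than `high` processes."""
    return {
        name: len(processes)
        for name, processes in diagrams.items()
        if not low <= len(processes) <= high
    }


# Hypothetical hierarchy: one well-sized diagram and one that is too small.
hierarchy = {
    "Level 0": [
        "Accept orders",
        "Plan production",
        "Provide materials",
        "Ship products",
    ],
    "Provide materials": [
        "Identify needs",
        "Search for suppliers",
    ],
}

outliers = diagrams_outside_range(hierarchy)
```

A diagram reported here is a candidate either for merging back into its parent (too few processes) or for a further level of decomposition (too many).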

The first step in constructing a DFD hierarchy is to construct context diagrams. Typically, when designing relatively simple systems, a single context diagram with a star topology is built. At its center is the so-called main process, connected to the sinks and sources of information through which users and other external systems interact with the system. Before building a context DFD, it is necessary to analyze the external events (external entities) that influence the functioning of the system. The number of data streams in a context diagram should be as small as possible, since each of them can be further broken down into several streams at subsequent levels of the diagram.

To test the context diagram, you can create a list of events. The list of events should consist of descriptions of the actions of external entities (events) and the corresponding system reactions to events. Each event must correspond to one or more data streams: input streams are interpreted as impacts, and output streams are interpreted as system reactions to input streams.
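A minimal sketch of such a test is given below. The Event class, the function check_event_list and the sample streams are hypothetical names introduced for the example. The check that every stream of the context diagram is covered by some event is an added assumption; the text above only requires that each event correspond to at least one stream.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    description: str
    impacts: list = field(default_factory=list)    # input streams
    reactions: list = field(default_factory=list)  # output streams


def check_event_list(events, context_streams):
    """Compare an event list with the streams of a context diagram."""
    known = set(context_streams)
    used = set()
    problems = []
    for event in events:
        streams = event.impacts + event.reactions
        if not streams:
            problems.append(f"event '{event.description}' has no data stream")
        for stream in streams:
            if stream not in known:
                problems.append(
                    f"event '{event.description}' uses unknown stream '{stream}'"
                )
        used.update(streams)
    # Assumption: a stream explained by no event deserves a second look.
    for stream in sorted(known - used):
        problems.append(f"stream '{stream}' is not covered by any event")
    return problems


# Hypothetical context diagram streams and events.
context_streams = ["Order form", "Invoice", "Payment", "Receipt"]
events = [
    Event("Customer places an order", ["Order form"], ["Invoice"]),
    Event("Customer pays the invoice", ["Payment"], []),
]
print(check_event_list(events, context_streams))
```

In this sample the "Receipt" stream is reported, which suggests that either an event is missing from the list or the stream is superfluous.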

For complex systems (signs of complexity include a large number of external entities (ten or more), the distributed nature of the system, or its multifunctionality), a hierarchy of context diagrams is built. In this case, the top-level context diagram contains not a single main process but a set of subsystems connected by data flows. The next level of context diagrams details the context and structure of the subsystems. Each subsystem present on the context diagrams is then detailed using a DFD. This can be done by building a diagram for each event. Each event is represented as a process with associated input and output streams, data storage devices, external entities, and references to other processes that describe the relationships between that process and its environment. Then all the constructed diagrams are combined into one zero-level diagram.

Each process on a DFD, in turn, can be detailed using a DFD or (if the process is elementary) a specification. The process specification should formulate its main functions in such a way that in the future the specialist implementing the project will be able to perform them or develop an appropriate program.

The specification is the terminal node (leaf) of the DFD hierarchy. The analyst decides to stop detailing a process and to use a specification based on the following criteria:

The process has a relatively small number of input and output data streams (2-3 streams);

The transformation of data performed by the process can be described as a sequential algorithm;

The process performs a single logical function of converting input information into output information;

The process logic can be described by a small specification (no more than 20-30 lines).
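The criteria above can be collected into a single check. The sketch below is only an illustration: the ProcessInfo class, its field names and the thresholds are assumptions, and reading "2-3 streams" as a limit on the total number of input and output streams is one possible interpretation. In practice the decision remains with the analyst.

```python
from dataclasses import dataclass


@dataclass
class ProcessInfo:
    name: str
    input_streams: int
    output_streams: int
    single_function: bool  # performs one logical function
    sequential: bool       # describable as a sequential algorithm
    spec_lines: int        # estimated length of the specification


def is_elementary(p, max_streams=3, max_spec_lines=30):
    """Return True if the process looks ready for a specification."""
    return (
        p.input_streams + p.output_streams <= max_streams
        and p.sequential
        and p.single_function
        and p.spec_lines <= max_spec_lines
    )


# Hypothetical processes, invented for illustration.
total = ProcessInfo("Calculate order total", 2, 1, True, True, 12)
warehouse = ProcessInfo("Manage warehouse", 5, 4, False, False, 200)
print(is_elementary(total), is_elementary(warehouse))
```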

Specifications are descriptions of algorithms for tasks performed by processes. They contain the number and/or name of the process, lists of input and output data, and the body (description) of the process, which is the specification of an algorithm or operation that transforms input data streams into output ones. Specification languages can range from structured natural language or pseudocode to visual modeling languages.

Structured natural language is used to describe process specifications in a clear, sufficiently rigorous manner. The following conventions are accepted when using it:

Process logic is expressed as a combination of sequential constructs, choice constructs, and iterations;

Verbs should be active, unambiguous and action-oriented (for example, "fill", "calculate", "extract" rather than "upgrade" or "process");

The logic of the process must be expressed clearly and unambiguously.

When building a DFD hierarchy, you should proceed to detailing processes only after determining the content of all data streams and data storage devices, which is described using data structures. For each data stream, a list of all its data elements is compiled; the data elements are then combined into data structures that correspond to larger data objects (for example, document lines or domain objects). Each object must consist of elements that are its attributes. Data structures can contain alternatives, conditional occurrences, and iterations. A conditional occurrence means that the component may be absent from the structure (for example, the insurance data structure for the employee object). An alternative means that the structure may include one of the listed elements. An iteration means that an element occurs any number of times within a specified range (for example, the element "child's name" for the object "employee"). For each data element, its type (continuous or discrete data) can be specified. For continuous data, the unit of measurement, range of values, precision of representation, and form of physical encoding may be specified. For discrete data, a table of acceptable values can be specified.
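The three constructs can be illustrated with a small typed structure. This is a sketch under stated assumptions: the employee's insurance data and children's names come from the examples above, while the identity document alternative (Passport or DriverLicense), the field names and the MAX_CHILDREN bound are hypothetical and added only for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


@dataclass
class Passport:
    number: str


@dataclass
class DriverLicense:
    number: str


@dataclass
class Insurance:
    policy_number: str


MAX_CHILDREN = 10  # hypothetical upper bound of the iteration range


@dataclass
class Employee:
    name: str
    # Alternative: exactly one of the listed structures is present.
    identity_document: Union[Passport, DriverLicense]
    # Conditional occurrence: the component may be absent.
    insurance: Optional[Insurance] = None
    # Iteration: any number of elements within a specified range.
    child_names: list = field(default_factory=list)


def validate(employee):
    """Return a list of problems found in the structure."""
    problems = []
    if not isinstance(employee.identity_document, (Passport, DriverLicense)):
        problems.append("identity document must be one of the alternatives")
    if len(employee.child_names) > MAX_CHILDREN:
        problems.append("too many occurrences of child's name")
    return problems


employee = Employee("A. Ivanov", Passport("0000"), child_names=["Olga"])
print(validate(employee))  # an empty list: the structure is valid
```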

In business process modeling, data flow diagrams (DFDs) are used to construct AS-IS and TO-BE models, which reflect the existing and the proposed structure of an organization's business processes and the interactions between them. In this case, the data used in the organization is described at a conceptual level, independent of the means of implementing the database, using the "entity-relationship" model.

Listed below are the main types of work, and their sequence, when building business models using the Yourdon methodology:

1. Description of the process context and construction of the initial context diagram.

The initial context data flow diagram should contain process zero, with a name that reflects the organization's activities, and the external entities connected to process zero through data flows. Data flows correspond to documents, requests, or messages that external entities exchange with the organization.

2. Specification of data structures.

The composition of data streams is determined and initial information is prepared for constructing a conceptual data model in the form of data structures. All structures and data elements of the "iteration", "conditional occurrence" and "alternative" types are highlighted. Simple structures and data elements are combined into larger structures. As a result, for each data flow a hierarchical (tree) structure must be formed, the final elements (leaves) of which are data elements, the tree nodes are data structures, and the top node of the tree corresponds to the data flow as a whole.
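The resulting tree can be sketched as follows. The DataNode class and the sample "Order" flow are assumptions made for the example: the root corresponds to the data flow as a whole, inner nodes are data structures, and leaves are data elements.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    name: str
    children: list = field(default_factory=list)

    def leaves(self):
        """Return the names of all data elements (leaves) under this node."""
        if not self.children:
            return [self.name]
        result = []
        for child in self.children:
            result.extend(child.leaves())
        return result


# Hypothetical data flow "Order", invented for illustration.
order = DataNode("Order", [
    DataNode("Header", [DataNode("Order number"), DataNode("Order date")]),
    DataNode("Order line", [DataNode("Product code"), DataNode("Quantity")]),
])
print(order.leaves())
# ['Order number', 'Order date', 'Product code', 'Quantity']
```

The list of leaves is what later has to appear as attributes in the conceptual data model.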

3. Construction of the initial version of the conceptual data model. For each class of domain objects, an entity is allocated. Connections between entities are established and their characteristics are determined. An entity-relationship diagram is constructed (without entity attributes).

4. Construction of data flow diagrams of zero and subsequent levels.

To complete the analysis of the functional aspect of the organization's activities, the initial context diagram is detailed (decomposed).

In this case, you can build a diagram for each event, assigning a process to it and describing its input and output streams, data storage devices, external entities, and links to other processes, so as to describe the connections between this process and its environment. After this, all the constructed diagrams are combined into one zero-level diagram.

Processes are divided into groups that have much in common (they work with the same data and/or have similar functions). They are depicted together on a lower (first) level diagram, and on the zero-level diagram they are combined into one process. The data storage devices used by the processes of each group are identified.

Complex processes are decomposed, and the consistency between the levels of the process model is checked.

Data storage devices are described through data structures, and lower-level processes are described through specifications.

5. Refinement of the conceptual data model.

Entity attributes are defined. Identifier attributes are highlighted. Connections are checked and supertype-subtype connections are identified (if necessary). The correspondence between the description of data structures and the conceptual model is checked (all data elements must be present on the diagram as attributes).