Data Integration
Data Integration Overview
We will cover two types of data integration:

- Simple Feed Integration
- Web Services
All new Demandware clients need to integrate data from other data systems.
Data Integration: Simple Feeds
The idea of Simple Feed Integration is to couple Demandware and external systems loosely by exchanging files on a File Transfer Server. Simple Feed Integration supports the WebDAV (HTTP and HTTPS) and SFTP protocols for transfers with the File Transfer Server. WebDAV over plain HTTP (no network-layer encryption) is meant for development/testing purposes only. When considering WebDAV HTTPS, keep in mind that an SSL certificate from a Demandware-accepted Certificate Authority (CA) must be installed on the File Transfer Server. The list of accepted CAs is available at Demandware's Customer Central.
The File Transfer Server needs to be hosted by the customer or another third party; Demandware does not offer File Transfer Server hosting. The WebDAV access to certain folders on a Demandware instance cannot be used for this purpose.
File Format
The file format for the incoming and outgoing feeds is the standard Demandware import/export format. For XML files the schema definitions (XSD files) can be downloaded from Customer Central. A good approach to create a sample file is to set up the data as desired in Business Manager, run an export from the Business Manager user interface and use the exported file as a template.
All import files can alternatively be provided in gzip format (not to be confused with zip). In that case, provide the file with the extension .xml.gz or .csv.gz on the File Transfer Server and adjust the file matching rule accordingly. The validation and import processes automatically unzip the files during processing.
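A file matching rule is typically a filename pattern. As a minimal sketch (the pattern and function below are hypothetical illustrations, not a Demandware API), a rule that accepts a catalog feed in both plain and gzipped form might look like:

```javascript
// Hypothetical file matching rule: accept catalog feeds named with an
// 8-digit date stamp, in plain XML or gzip form,
// e.g. catalog_20240101.xml or catalog_20240101.xml.gz
var feedPattern = /^catalog_\d{8}\.xml(\.gz)?$/;

function matchesFeedRule(fileName) {
    return feedPattern.test(fileName);
}
```

Matching both extensions in one rule lets the same job pick up the feed regardless of whether the source system compressed it.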
Jobs
Jobs within Demandware come in three types: scheduled, import/export, and batch. You can schedule jobs to run on a recurring schedule or one time only. Demandware also provides means of detecting problems with job execution, notifying about job status, and recovering from failed jobs. For instance, if a job fails to execute successfully, you can send an e-mail notification to alert someone to the problem. You can set job failure rules so that jobs automatically re-run if execution fails, and configure retry parameters such as the number of retries.
You can define jobs that execute on the global (or organization) level and jobs that execute only on the site level through Administration > Operations.
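As a generic illustration of how a failure rule with a retry limit behaves (plain script, not the Business Manager configuration itself), a bounded-retry loop looks like this:

```javascript
// Generic sketch of a job retry rule: run the job once, then re-run it
// up to maxRetries more times if it keeps failing, and report whether
// it eventually succeeded and how many attempts were made.
function runWithRetries(job, maxRetries) {
    var attempts = 0;
    while (attempts <= maxRetries) {
        attempts++;
        if (job()) {
            return { succeeded: true, attempts: attempts };
        }
    }
    return { succeeded: false, attempts: attempts };
}
```

The total number of executions is therefore one initial run plus at most maxRetries retries.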
Import into Production
Technically, all object types can be imported on all Demandware instance types. However, importing data directly into the Production instance may have adverse effects, as there is no preview or rollback functionality available. Please load all feeds that can be previewed into the Staging instance for approval, and use Demandware's data replication feature for promotion to Production. Data replication also comes with rollback functionality.
Consider the different import modes that the catalog/pricebook/inventory imports support. The following sample data set illustrates the cases an import has to handle:

- A +a: an object with a new attribute added
- B −a: an object with an attribute removed
- C: an object left unchanged
- D: a new object
- F Δa: an object with an attribute value that is changed
Some imports support an attribute mode at the import element level. In this case, the only supported mode is DELETE, whereby the import mode for the process can be overridden for a particular element. This is useful when feeding changed information to a Demandware system, as it allows a single import process to create, update, and delete objects.
You can overwrite the XML file import mode for a single element, but only to specify DELETE:
<product product-id="12345" mode="delete"/>
When considering imports directly into Production, keep in mind that Demandware imports typically commit changes to the database on a per-object basis. This means that during the execution of the import, some updates may already be visible in the storefront while others are not yet. The catalog import is a two-pass import: during the first pass objects are updated; during the second pass relationships between objects are updated. Running the catalog import directly against Production with Simple Feed Integration is therefore not recommended.
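To make the element-level override concrete, here is a minimal, hypothetical simulation of how a single import pass could create, update, and delete objects depending on the per-element mode (plain script for illustration; the real behavior is implemented by the Demandware import pipeline):

```javascript
// Minimal simulation of one import pass: the process-level mode creates
// or updates objects, while an element carrying mode "delete" overrides
// that and removes the object instead.
function runImport(db, elements) {
    elements.forEach(function (el) {
        if (el.mode === "delete") {
            delete db[el.id];     // element-level override: remove the object
        } else {
            db[el.id] = el.data;  // process mode: create or update the object
        }
    });
    return db;
}
```

Because objects are committed one at a time, a storefront reading `db` midway through the loop would see a mix of old and new data, which is exactly the per-object visibility caveat described above.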
Simple Feed Cartridges
The two cartridges included in the Simple Feed Integration are:

- int_simplefeeds: Implementation of the generic logic. Needs to be assigned to both the storefront site and the Business Manager site. Modification of the Custom_FeedJobs pipeline is necessary to use the feed configuration custom object. Do not modify other code.
- test_simplefeeds: Cartridge that helps troubleshoot WebDAV or SFTP connection issues and trigger a Simple Feed Integration job from the storefront during development. This cartridge must not be assigned to a site on Production systems. It may be assigned to the storefront site on sandbox instances if the storefront is password protected.
WebDAV is the only protocol Demandware supports with which Demandware can both access files external to Demandware and let external systems push files to Demandware; it is the only protocol that works in both directions.
You cannot use SFTP to access files on a Demandware instance: there is no SFTP server on Demandware instances. However, Demandware can access an external SFTP server.
IMPORTANT NOTE
It is not advisable to expose the RunJobNow pipelet in a public pipeline as this could be exploited to flood your job queue. If you must use this pipelet in the storefront, please implement a custom security mechanism so that only authorized requests can execute the pipelet.
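One possible custom security mechanism (a sketch only; the parameter name and token check are hypothetical, not a Demandware API) is to reject any storefront request that does not carry a pre-shared secret before the pipelet is ever reached:

```javascript
// Hypothetical guard for a job-trigger entry point: only requests that
// present the correct pre-shared token may proceed to RunJobNow.
var EXPECTED_TOKEN = "change-me-per-environment"; // assumed per-environment secret

function isAuthorizedJobRequest(requestParams) {
    return requestParams.token === EXPECTED_TOKEN;
}
```

In a real pipeline the guard would end the request (for example with an error page) whenever this check returns false, so unauthenticated callers cannot flood the job queue.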
Data Integration: Web Services
When you need real-time data integration with a 3rd party system or your own backend, you need to invoke a web service.
A web service is a remote service created to expose data in a database, or the result of a calculation. It is a remote procedure call that goes over HTTP. There are a number of supported protocols for web services, but we will focus on SOAP-style web services.
In Demandware, Axis 1.4 is used to compile WSDL files. As of this writing SOAP 1.1 is supported, and since release 14.4 SOAP 1.2 as well. Once the WSDL is uploaded, all the supporting class files are generated automatically. To access these files you use the Demandware dw.rpc.WebReference class. An instance of this class represents the package of all the generated WSDL classes:
var webref : WebReference = webreferences.CurrencyConvertor;
The WebReference class has one property, called defaultService. The defaultService property returns a stub:
var stub : Stub = webref.defaultService;
You can also invoke a specific service if you know its name and port:
webref.getService('EducationService', 'EducationServiceSOAP')
The Stub instance allows you to access the remote methods of your web service:
var response : ConversionRateResponse = stub.conversionRate(cr);
Generating WSDL Javadocs
The graphical presentation of a WSDL in Studio does not always make it clear how to invoke methods on the Web Service. Optionally, you can generate Java classes and javadoc to help you write the Demandware Script code:
- Download the Axis 1.4 and Xerces jar files from Apache.
- Create an AXISCLASSPATH environment variable pointing to the downloaded jars.
- Generate Java stubs using the command:

java -cp %AXISCLASSPATH% org.apache.axis.wsdl.WSDL2Java --noWrapped --all --package currencyconvertor CurrencyConvertor.wsdl

- Generate the Javadoc for use in coding:

javadoc -d currencyconvertor\javadoc currencyconvertor
WSDL File
A WSDL file is a document that describes a web service. It specifies the location of the service and the operations (or methods) the service exposes.
You can view the WSDL file either graphically or programmatically in Studio. The WSDL file is all the Demandware server needs to generate the Java classes for accessing the methods in the remote service. You will need to write a script file that accesses those classes using the Demandware API.
Integrate a web service
1. Get the WSDL file for the web service you wish to invoke.
2. Place the WSDL file in the webreferences folder in your cartridge.
3. Using a script pipelet, add the following to your script:
- Import the dw.rpc package.
- Create a WebReference object (example below):
var webref : WebReference = webreferences.WSDLFileName;
- Get the service stub object (example below):
var stub : Stub = webref.defaultService;
4. Invoke the methods/properties from the web service you need. Generate Javadocs from the WSDL if you need help invoking the remote method(s).
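Putting the steps above together, the following sketch mimics the call pattern in plain script. The `webreferences` object, the CurrencyConvertor service, and the `conversionRate` operation are stand-ins built here purely for illustration; on a Demandware instance the platform generates the real objects from the uploaded WSDL.

```javascript
// Stand-in for the platform-provided 'webreferences' object, which on a
// Demandware instance exposes one WebReference per uploaded WSDL file.
var webreferences = {
    CurrencyConvertor: {
        // defaultService returns the stub exposing the remote operations
        defaultService: {
            // Hypothetical remote operation: convert an amount using a
            // fixed illustrative exchange rate.
            conversionRate: function (request) {
                var rate = 0.85; // illustrative rate, hard-coded for the sketch
                return { rate: rate, converted: request.amount * rate };
            }
        }
    }
};

// The integration pattern from the steps above:
var webref = webreferences.CurrencyConvertor;        // steps 1-2: WSDL-derived package
var stub = webref.defaultService;                    // step 3: obtain the service stub
var response = stub.conversionRate({ amount: 100 }); // step 4: invoke the remote method
```

The real stub would perform a SOAP call over HTTP, but the shape of the calling code in your script file is the same.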