Directory Structure

Script Requirements

All of the following scripts must be present in the same directory (script_dir):

 

class_fit2.py

class_hb.py

class_titr.py

create_titr.py

fitting2.py

fitting_models.py

make_figinstr2.py

misc_functions.py

program_scripts.py

settings.py

figures2.py

 

To run a script, type "python script_dir/script.py", replacing script.py with the name of the script.
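Before running anything, it can help to confirm that script_dir is complete. The following is a minimal sketch, not part of the distributed scripts: the function name missing_scripts is made up for illustration, and the file names are copied from the list above.

```python
import os

# Script names copied from the "Script Requirements" list.
REQUIRED_SCRIPTS = [
    "class_fit2.py", "class_hb.py", "class_titr.py", "create_titr.py",
    "fitting2.py", "fitting_models.py", "make_figinstr2.py",
    "misc_functions.py", "program_scripts.py", "settings.py", "figures2.py",
]


def missing_scripts(script_dir):
    """Return the required script names that are not files in script_dir.

    Hypothetical helper for illustration; it is not one of the listed scripts.
    """
    return [name for name in REQUIRED_SCRIPTS
            if not os.path.isfile(os.path.join(script_dir, name))]
```

An empty list returned by missing_scripts("script_dir") means every required script was found.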

General Flow

Directories:

 

Files:

1. Create a directory where the analysis will take place (/analysis)

2. Prepare peak list files.

· Sparky Peak List Files: For each spectrum in Sparky, create a peak list file (use the lt command) and save it in the analysis directory.  Files must have a ‘.list’ extension.

· Sparky Save Files: Use Harms’ sparky_extract-peaks.py program to generate a peak summary file with the extension ‘.mjh’.

· Manually created files: Create your own titration file.  The file must be tab-delimited, its first row must contain headers, and its name must end with ‘CS-pH.txt’.
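The three requirements for a manually created file (name ending in ‘CS-pH.txt’, tab-delimited, headers in the first row) can be checked before running create_titr.py. This is a sketch only, written for Python 3: read_titration_table is a hypothetical helper, and it makes no assumption about which column headers the real scripts expect.

```python
import csv


def read_titration_table(path):
    """Read a tab-delimited titration file and return (headers, rows).

    Hypothetical helper: it checks only the file name and field counts,
    not the column names the downstream scripts may require.
    """
    if not path.endswith("CS-pH.txt"):
        raise ValueError("file name must end with 'CS-pH.txt': %s" % path)
    with open(path, newline="") as handle:
        rows = list(csv.reader(handle, delimiter="\t"))
    if not rows:
        raise ValueError("file is empty; the first row must contain headers")
    headers, data = rows[0], rows[1:]
    for number, row in enumerate(data, start=2):
        if len(row) != len(headers):
            raise ValueError("line %d has %d fields, expected %d"
                             % (number, len(row), len(headers)))
    return headers, data
```

A file that was saved with spaces instead of tabs shows up here as a field-count mismatch or as a single wide column.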

3. Create a titration.log file in the analysis directory.  This file will store the information about the titration – protein, salt condition, temperature, magnet, and the name of the titration experiment.  The name of the experiment is the most critical, as it will be used by downstream programs for plotting purposes.

4. In the analysis directory, run create_titr.py.  This script will generate.

# Fit control file example

#

# 20060515

#

fit_type: one_site_mHb, fit_program: R, atom_type: CG, res_type: ASP, which: all

ASP_19, not-fitable

ASP_21, not-fitable

ASP_40, CS_AH1: 175.37, CS_A_: 179.10, pKa1: 3.85, n1: 0.57

ASP_77, not-fitable

ASP_83, not-fitable

ASP_95, not-fitable

ASP_143, CS_AH1: 176.72, CS_A_: 180.08, pKa1: 3.78, n1: 0.76

ASP_146, CS_AH1: 176.84, CS_A_: 180.37, pKa1: 3.85, n1: 0.76

end

fit_type: two_site_b, fit_program: R, atom_type: CG, res_type: ASP, which: select

ASP_21, CS_AH2: 176.60, CS_AH1: 176.20, CS_A_: 179.87, pKa1: 3.01, pKa2: 6.54

end

If a particular resonance cannot be fitted (e.g. the Asp-77 chemical shift exhibits no pH dependence), the line should read ‘ASP_77, not-fitable’.

 

The format of a fit instruction line is:
[residue type]_[residue number], [parameter_name]: [value], [parameter_name]: [value]
Examples:

GLU_43, CS_AH1: 179.69, CS_A_: 184.07, pKa1: 4.32, n1: 0.70

ASP_95, fixed~CS_AH1: 178.01, CS_A_: 181.66, pKa1: 2.0, n1: 0.8
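To make the line format concrete, here is a minimal sketch of how one instruction line could be parsed. It is an illustration, not the parser used by the fitting script: parse_fit_line is a hypothetical name, and the handling of the ‘fixed~’ prefix and of ‘not-fitable’ lines is inferred from the examples in this document.

```python
def parse_fit_line(line):
    """Parse one fit instruction line into (residue, params, fixed).

    params maps parameter names to floats, and fixed is the set of
    parameter names that carried the 'fixed~' prefix. For a
    'not-fitable' line, params is None and fixed is empty.
    """
    fields = [field.strip() for field in line.split(",")]
    residue = fields[0]
    if fields[1:] == ["not-fitable"]:
        return residue, None, set()
    params, fixed = {}, set()
    for field in fields[1:]:
        name, value = field.split(":")
        name = name.strip()
        if name.startswith("fixed~"):
            name = name[len("fixed~"):]  # drop the prefix, remember the name
            fixed.add(name)
        params[name] = float(value)
    return residue, params, fixed
```

For the ASP_95 example this returns the residue label, four numeric parameters, and a fixed set containing only CS_AH1.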

Lines that begin with ‘#’ are not read by fitting2.py. The file is divided into sections that begin with a ‘fit_type’ line and end with ‘end’.  Each section uses a different fit model and fits different residue and atom types.  There are five essential parameters on the fit_type line:

To fix a parameter at its listed value, add “fixed~” in front of the parameter name.
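The section layout of the fit control file (comment lines starting with ‘#’, sections opened by a fit_type line and closed by ‘end’) can be sketched as a small splitter. This is an assumption-laden illustration rather than the actual reader: split_fit_sections is a hypothetical name, and the real script may treat whitespace or malformed sections differently.

```python
def split_fit_sections(text):
    """Split fit control file text into a list of (header, lines) pairs.

    header is a dict built from the 'fit_type' line; lines holds the raw
    instruction lines found before the closing 'end'.
    """
    sections, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comment lines are skipped
        if line.startswith("fit_type"):
            items = (item.split(":", 1) for item in line.split(","))
            header = dict((key.strip(), value.strip()) for key, value in items)
            current = (header, [])
        elif line == "end":
            if current is not None:
                sections.append(current)
            current = None
        elif current is not None:
            current[1].append(line)
    return sections
```

Each header dictionary then carries the five fit_type-line parameters (fit_type, fit_program, atom_type, res_type, which) as strings.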

6. Run fitting2.py in the analysis directory. This script uses the gnls function in R. This script will read the .fit files and create

 

7. Create a subdirectory called ‘plots’.

8. Copy (or symbolically link) the titr_fit.pickle file (if the data have already been fitted) or otherwise the titr.pickle file into the plots directory.
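Steps 7 and 8 can also be scripted. The sketch below assumes that ‘plots’ is a subdirectory of the analysis directory and that the pickle files sit directly in the analysis directory. stage_pickle is a hypothetical helper, not one of the listed scripts.

```python
import os
import shutil


def stage_pickle(analysis_dir, plots_dir=None):
    """Link (or copy) the pickle file into the plots directory.

    Returns the path of the staged file. The fitted pickle is preferred
    when it exists, following step 8.
    """
    if plots_dir is None:
        plots_dir = os.path.join(analysis_dir, "plots")  # assumed location
    if not os.path.isdir(plots_dir):
        os.makedirs(plots_dir)
    for name in ("titr_fit.pickle", "titr.pickle"):
        source = os.path.join(analysis_dir, name)
        if os.path.isfile(source):
            break
    else:
        raise IOError("no titr_fit.pickle or titr.pickle in %s" % analysis_dir)
    target = os.path.join(plots_dir, name)
    if os.path.lexists(target):
        return target  # already staged; leave it alone
    try:
        os.symlink(os.path.abspath(source), target)
    except (OSError, AttributeError, NotImplementedError):
        shutil.copy(source, target)  # fall back where symlinks are unavailable
    return target
```

Linking keeps the plots directory in step with later refits, while copying freezes the data at the time of the copy.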

9. Within the plots directory, create a figure instruction file by running make_figinstr2.py.  This script will create a file called instructions.txt with plot parameters.  All of the fits performed will be listed in this file.

 

10. Edit the instructions.txt file.  Here is an example file:

rows: 5, columns: 4, page_margins: 0.5_0.5_0.5_0.5

x_range: 1_9, y_amplitude: 4.5, plot_margins: 0.15_0.15_0.15_0.15, show_pKa: y

name: DPPHS_25C_0p10M-KCl_2

figure_1

ASP_19_CG, DPPHS_25C_0p10M-KCl_2: two_site_b--fixed~CS_AH2:177.43

figure_2

ASP_21_CG, DPPHS_25C_0p10M-KCl_2: two_site_b

Instructions for each data set to be plotted are written as follows:
[descriptor], [titration_name]: [fit_name]

GLU_101_CD, DPPHS_25C_0p10M-KCl_2: one_site_mHb

The descriptor contains three fields: residue_type, residue_number, and atom_type.  These fields are combined into one string using ‘_’.

 

If multiple fits are to be plotted for the same data set, the fit names must be separated by a comma:

one_site_mHb, one_site_mHb--fixed~CS_AH1:179.18

If multiple data sets are to be plotted on the same figure, list each one on its own line under that figure:

figure_6
ASP_95_CG, DPPHS_25C_0p10M-KCl_3: one_site_mHb--fixed~CS_AH1:178.02
ASP_95_CG, DPPHS_25C_1p00M-KCl_1: one_site_mHb
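As a rough illustration of the data-set line format, the sketch below splits one line into its descriptor fields, titration name, and fit names. parse_plot_line is a hypothetical helper. It assumes that the first colon ends the titration name (fit names such as ‘one_site_mHb--fixed~CS_AH1:179.18’ contain a colon themselves) and that commas after that point only separate fit names.

```python
def parse_plot_line(line):
    """Parse '[descriptor], [titration_name]: [fit_name](, [fit_name])*'."""
    descriptor, rest = line.split(",", 1)
    # The descriptor joins residue type, residue number and atom type with '_'.
    res_type, res_number, atom_type = descriptor.strip().split("_")
    titration, fits = rest.split(":", 1)  # split on the first colon only
    return {
        "residue_type": res_type,
        "residue_number": int(res_number),
        "atom_type": atom_type,
        "titration": titration.strip(),
        "fits": [fit.strip() for fit in fits.split(",")],
    }
```

Splitting on the first colon only is the key design choice: it keeps the value attached to a fixed parameter inside the fit name instead of treating it as a new field.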

11. Edit settings.py to incorporate the plot settings for that titration data set:

'DPPHS_25C_0p10M-KCl_1' : {'color': 'black', 'pch': 21, 'bg': 'red'}

12. Run figures2.py by typing "figures2.py instructions.txt" in the plots directory.  This will create an instructions_.ps file, which you can open in Ghostscript/Ghostview (on Windows), Preview (on Macs), or display (on Unix).
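The settings.py entry for a titration (step 11) looks like one item of a Python dictionary keyed by titration name. The sketch below assumes that layout. The names PLOT_SETTINGS, DEFAULT_STYLE, and style_for are invented for illustration and may not match what settings.py actually defines. The keys pch and bg resemble R's point-symbol arguments (in R, pch = 21 draws a circle whose fill colour is set by bg), but how figures2.py passes them on is not shown in this document.

```python
# Hypothetical layout: a dictionary keyed by titration name.
PLOT_SETTINGS = {
    'DPPHS_25C_0p10M-KCl_1': {'color': 'black', 'pch': 21, 'bg': 'red'},
}

# Assumed fallback for titrations without an entry; not from settings.py.
DEFAULT_STYLE = {'color': 'black', 'pch': 21, 'bg': 'white'}


def style_for(titration_name, settings=PLOT_SETTINGS):
    """Return the plot style for a titration, falling back to the default."""
    style = dict(DEFAULT_STYLE)
    style.update(settings.get(titration_name, {}))
    return style
```

Because the dictionary key is the titration name, it has to match the name recorded in titration.log exactly.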

NMR Data fitting tool

Documentation

analysis      Titration peak lists analyzed for pKa determination, and plots of chemical shift data vs. pH.
postscripts   PostScript files of spectra (Sparky’s printing service was used for these).
sparky_save   Sparky .save files for each spectrum recorded.
data_files
fit_files
spectra       Raw NMR data for each experiment.

Raw Spectrometer FID data
fid                                Raw Varian binary data or raw Bruker binary data for 1D experiments
ser                                Raw Bruker binary data for multi-dimensional experiments

Acquisition parameter files
procpar                            Varian acquisition parameter files
acqu, acqus, acqu2, acqu2s, etc.   Bruker acquisition files for each dimension before and after data acquisition

NMRPipe processing scripts
fid.com                            Converts spectrometer data to NMRPipe format.
ft2.com                            Converts NMRPipe FID data into FT-processed data.  This script is commonly used for processing HSQC experiments.
ipap_ft2.com                       Converts NMRPipe ipap_proc FID data into FT-processed data.  This script is commonly used for processing CBCGCO IPAP experiments.

Spectra for analysis
*.ucsf                             Spectral files for analysis in Sparky.  All NMRPipe spectra were converted to ucsf format using the Sparky pipe2ucsf script.
*.save

Software                  Version     Website
Python                    2.6         http://www.python.org/download/
R                         >= 2.10.0   http://www.r-project.org/
RPy2                      >= 2.1.0    http://rpy.sourceforge.net/
nlme                      any         http://cran.r-project.org/web/packages/nlme/index.html
Pywin32 extension (Win)   any         http://sourceforge.net/projects/pywin32/

fit_type      one_site_mHb, one_site or two_site_b (these are defined in fitting_models.py)
fit_program   R
atom_type     HN, N, CO, CA, CB, CG, CD, HB2, HB3, etc.
res_type      ASP, GLU, HIS, etc.
which         select or all.  ‘select’ means that only those resonances listed below the fit_type line will be fitted, while ‘all’ means that all resonances with the same residue type and atom type will be fitted.  If fitting parameters are not listed for some resonances, default initial guesses will be used.  Initial guesses are listed in the fitting_models.py file.
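The contents of fitting_models.py are not reproduced in this document, so the exact model definitions are not shown here. Purely as an illustration, a modified Hill equation is a common way to describe a single titrating site with the four parameters that appear in the fit instruction examples (CS_AH1, CS_A_, pKa1, n1). The sketch below assumes that this is what one_site_mHb refers to. The function name and the exact functional form are assumptions and should be checked against fitting_models.py.

```python
def one_site_modified_hill(pH, CS_AH1, CS_A_, pKa1, n1):
    """Chemical shift at a given pH for one titrating site (assumed model).

    CS_AH1 is the shift of the protonated state, CS_A_ the shift of the
    deprotonated state, pKa1 the midpoint and n1 the Hill coefficient.
    """
    # Ratio of protonated to deprotonated population, with Hill coefficient n1.
    ratio = 10.0 ** (n1 * (pKa1 - pH))
    return (CS_AH1 * ratio + CS_A_) / (1.0 + ratio)
```

In this form the curve approaches CS_AH1 at low pH and CS_A_ at high pH, and passes through the average of the two at pH = pKa1. A Hill coefficient below 1 makes the transition broader than that of an ideal single site.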

titr_fit_summary.txt   A tab-delimited text file listing initial guesses, fit name, final fit parameters and standard errors
fit_files directory    Directory that contains fit summary files for each resonance
Python pickle          A text file that stores all of the titration data and fit data in one file, which is read by the subsequent tables.py and figures2.py scripts
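The pickle file can be inspected from an interactive Python session. The sketch below uses only the standard pickle module. load_titration_pickle is a hypothetical helper, and nothing is assumed about the structure of the stored object.

```python
import pickle


def load_titration_pickle(path):
    """Load and return whatever object is stored in a pickle file."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

If the stored object is an instance of a class defined in one of the scripts (for example in class_titr.py), Python must be able to import that module when unpickling, so run this from a session where script_dir is on the import path. A pickle written by one major Python version may not load cleanly in another, and only pickle files from a trusted source should be loaded.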

Plot Type                  make_figinstr.py input parameters
D, E, COOH using Cg/Cd     COOH
D, E, COOH using Cb/Cg     COOH CBCG
D, E, COOH using Hb2/Hg2   COOH H2
D, E, COOH using Hb3/Hg3   COOH H3
HN                         HN

Parameter          Description
rows and columns   specifies the layout of how the figures should be plotted on a page
page_margins       specifies outer margins outside the figure plotting area (margins are listed in the order left, bottom, right, top, and separated by ‘_’)
x_range            COOH H2
y_amplitude        COOH H3
plot_margins       HN
show_pKa           specifies whether the pKa value from a fit will be displayed or not
name               defines the title of the page
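The layout lines at the top of instructions.txt use a ‘key: value’ format separated by commas, with ‘_’ joining multi-part values such as page_margins. The sketch below shows one way to read them. Both function names are hypothetical, and the margin order follows the description of page_margins above (left, bottom, right, and the upper margin last).

```python
def parse_layout_line(line):
    """Parse 'key: value, key: value' into a dict of raw strings."""
    pairs = (item.split(":", 1) for item in line.split(","))
    return dict((key.strip(), value.strip()) for key, value in pairs)


def parse_margins(value):
    """Split a four-part margin string such as '0.5_0.5_0.5_0.5'."""
    left, bottom, right, top = (float(part) for part in value.split("_"))
    return {"left": left, "bottom": bottom, "right": right, "top": top}
```

For example, parse_layout_line("rows: 5, columns: 4, page_margins: 0.5_0.5_0.5_0.5") yields three string values, and passing the page_margins value to parse_margins converts it to four floats.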